168 lines
4.9 KiB
Markdown
2024-12-14 2-4pm CST

myk: minor IP considerations, due to some work done for a company 15 years ago

databases hold on to state
e.g. a query can be answered by this information
table ops can be optimized with relational algebra
pretty good ontology
so why change? what needs improvement?
e.g. schema changes
it's possible but it introduces complexity
ex. different departments in same company, modeling same things differently in parallel
last write wins
effects of decisions stack; you're stuck with what's already been done

relational means records are defined in reference to other records
NoSQL: no schema, can change things easily
but it's hard to produce a coherent materialized view

at some point data lakes became a thing, which works but means your eng. team is now part of your db

changes cost more over time
early changes are impactful
ideally we want a flexible ontology

need to be able to see both the dao and the ten thousand things

append-only stream of immutable deltas -- CRDT
assemble a sequence of deltas to obtain a materialized view

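A minimal sketch of the idea above: deltas are appended to a log and never changed, and a materialized view is obtained by folding over that log. The `Delta` fields (`id`, `entity`, `prop`, `value`) and the `materialize` function are assumptions made for illustration, not a shape agreed in the meeting.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)  # frozen: a delta cannot be mutated after creation
class Delta:
    id: str       # hypothetical unique identifier
    entity: str   # the domain entity the assertion is about
    prop: str     # the property being asserted
    value: Any


def materialize(stream: list[Delta]) -> dict[str, dict[str, list[Any]]]:
    """Fold the delta stream into a view: entity -> property -> asserted values."""
    view: dict[str, dict[str, list[Any]]] = {}
    for d in stream:
        view.setdefault(d.entity, {}).setdefault(d.prop, []).append(d.value)
    return view


# The log only ever grows; nothing is updated in place.
stream: list[Delta] = []
stream.append(Delta("d1", "user:1", "shirt_size", "L"))
stream.append(Delta("d2", "user:1", "color", "black"))
stream.append(Delta("d3", "user:1", "shirt_size", "XL"))

view = materialize(stream)
```

Note that this view keeps every asserted value per property rather than overwriting; picking one value is left to a later step.
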
-- q: how to index for query optimization?
a: a succession of more specialized caches / layers of view composition

-- q: graph?
a: hypergraph -- domain nodes never reference each other, only connected by the deltas that reference them

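One way to picture the hypergraph answer: each delta acts as a hyperedge over the set of domain nodes it references, and the nodes themselves hold no pointers to each other. The sketch below assumes made-up delta ids and node names, and the `by_node` index and `neighbors` helper are hypothetical.

```python
from collections import defaultdict

# Each delta is a hyperedge: the set of domain nodes it references.
# Ids and node names are made up for illustration.
deltas = {
    "d1": {"alice", "bob"},             # e.g. "alice and bob are friends"
    "d2": {"bob", "acme", "engineer"},  # e.g. "bob works at acme as engineer"
}

# Index: node -> ids of the deltas that reference it.
by_node = defaultdict(set)
for delta_id, nodes in deltas.items():
    for node in nodes:
        by_node[node].add(delta_id)


def neighbors(node):
    """Nodes connected to `node` through at least one shared delta."""
    found = set()
    for delta_id in by_node[node]:
        found |= deltas[delta_id]
    found.discard(node)
    return found
```

Here `alice` and `acme` are never directly linked; they are only related through `bob`, and only because deltas mention them together.
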
now instead of relational tables, that information is pulled into deltas

does that make sense, asks myk?

-- q: why is it called a delta?
a: because it represents a change in state, and metadata associated with that change

-- q: how to operationalize the management? to define useful ontologies and pragmatic procedures for usage

-- q: computational objects?

databases could subscribe to one another

-- q: we've talked about indexing, how are we formulating queries?
a db is a func that persists information
a subtype is a query, which is a func that returns information
function application, connected by hyperedges
can have a function that's a query edge

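A rough sketch of "a db is a func that persists, a query is a func that returns", assuming a dict-shaped delta. `DeltaStore`, `Query`, and `sizes_for` are hypothetical names; the point is only that the store does not need to know anything about the queries run against it.

```python
from typing import Any, Callable

Delta = dict[str, Any]                # hypothetical delta shape
Query = Callable[[list[Delta]], Any]  # a query: deltas in, information out


class DeltaStore:
    """Persistence tier: only knows how to append deltas and hand them to a query."""

    def __init__(self) -> None:
        self._log: list[Delta] = []

    def persist(self, delta: Delta) -> None:
        self._log.append(delta)

    def run(self, query: Query) -> Any:
        # The query gets a copy of the log, so it cannot alter what was persisted.
        return query(list(self._log))


def sizes_for(entity: str) -> Query:
    """Build a query function returning every shirt size asserted for `entity`."""
    return lambda log: [
        d["value"]
        for d in log
        if d["entity"] == entity and d["prop"] == "shirt_size"
    ]


store = DeltaStore()
store.persist({"id": "d1", "entity": "user:1", "prop": "shirt_size", "value": "L"})
store.persist({"id": "d2", "entity": "user:2", "prop": "shirt_size", "value": "M"})
result = store.run(sizes_for("user:1"))
```
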
binding between persistence tier and query tier is loose

stuff they tried, e.g. storing keys and grouping them, etc... not necessarily what we want

say a db is forked, 2 copies extended separately for a year, then you want to merge them

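If deltas are immutable and carry globally unique ids (both assumptions here), merging two forks at the log level reduces to a set union keyed by id. The `ts` field used for ordering is also an assumption. This sketch says nothing about reconciling views or schemas built on top of the logs.

```python
def merge(a, b):
    """Union two delta logs by id; shared history is kept once."""
    merged = {d["id"]: d for d in a}
    for d in b:
        # Same id means same immutable delta, so the first copy is kept.
        merged.setdefault(d["id"], d)
    # Order by (timestamp, id) so the result does not depend on argument order.
    return sorted(merged.values(), key=lambda d: (d["ts"], d["id"]))


base = [{"id": "d1", "ts": 1, "prop": "size", "value": "M"}]
fork_a = base + [{"id": "a1", "ts": 2, "prop": "size", "value": "L"}]
fork_b = base + [{"id": "b1", "ts": 3, "prop": "color", "value": "black"}]

merged = merge(fork_a, fork_b)
```

Because nothing is overwritten, both forks' assertions survive the merge; any disagreement between them shows up later, when a view is assembled.
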
hopping back to computation
myk's WTF moment about this system
say we define a function that takes t-shirt specs and places an order, invoking a remote API
what if I don't have a remote API, just a friend with a t-shirt printer?
friend can be the implementation of that function
I just send them a form to complete, with shipping info etc
now there's a human in my functional dependency chain

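The t-shirt example can be sketched as one function signature with two interchangeable implementations. Everything named here (`ShirtSpec`, `OrderShirt`, `order_via_api`, `OrderViaFriend`) is hypothetical, and the "remote API" is a stand-in that contacts nothing.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class ShirtSpec:
    size: str
    color: str


class OrderShirt(Protocol):
    """Anything callable with a spec that returns an order reference."""
    def __call__(self, spec: ShirtSpec) -> str: ...


def order_via_api(spec: ShirtSpec) -> str:
    # Stand-in for a remote API call; no real service is contacted.
    return f"api-order:{spec.size}:{spec.color}"


class OrderViaFriend:
    """A person implements the function; calling it queues a form for them."""

    def __init__(self) -> None:
        self.pending_forms: list[dict] = []

    def __call__(self, spec: ShirtSpec) -> str:
        # Shipping info is left blank for the friend to fill in.
        self.pending_forms.append(
            {"size": spec.size, "color": spec.color, "shipping": None}
        )
        return f"form-sent:{len(self.pending_forms)}"


def place_order(order: OrderShirt, spec: ShirtSpec) -> str:
    # The caller cannot tell whether a machine or a human is behind `order`.
    return order(spec)


friend = OrderViaFriend()
```
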
normally with a db there's the notion of canonical source of truth
this system doesn't have that

-- q: trust model, threat model

provenance?

-- q: distributed system considerations

what if schemas have diverged?
in a traditional graph you only get what you select

what if, instead of edge nodes, embeddings?
embedding like in LLM context
e.g. embedding of word "friends"
gives you the _numbers_ associated with that term
semantic map
coordinates in semantic space

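To make "coordinates in semantic space" concrete: an embedding maps a term to a vector, and closeness between vectors is commonly measured with cosine similarity. The 3-dimensional vectors below are made up for illustration; real models produce vectors with hundreds or thousands of dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity: 1.0 for the same direction, near 0 for unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, invented for this sketch (not output from any real model).
embedding = {
    "friends": [0.9, 0.1, 0.2],
    "buddies": [0.8, 0.2, 0.3],
    "invoice": [0.1, 0.9, 0.1],
}

close = cosine(embedding["friends"], embedding["buddies"])
far = cosine(embedding["friends"], embedding["invoice"])
```

With these toy numbers, "friends" sits much closer to "buddies" than to "invoice".
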
-- q: as an attacker could I hijack deltas?
depends on our trust model

agnostic to implementation
high level structure

Strange Loop conference
talk: "Turning the database inside-out" (Martin Kleppmann)

-- q: details of content of each delta:
set of bindings
assertions targeting particular properties of entities
each binding gets a name

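One possible reading of that answer as a data structure, offered as a sketch only: a delta holds a set of named bindings, and each binding points at an entity and optionally at the property it makes an assertion about. The field names (`name`, `entity`, `prop`) and the `targets` helper are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Binding:
    name: str                   # the name this binding has within the delta
    entity: str                 # the entity being referenced
    prop: Optional[str] = None  # property of that entity being targeted, if any


@dataclass(frozen=True)
class Delta:
    id: str
    bindings: tuple[Binding, ...]


# "alice and bob are friends" as one delta with two named bindings.
friendship = Delta(
    id="d1",
    bindings=(
        Binding(name="a", entity="alice", prop="friends"),
        Binding(name="b", entity="bob", prop="friends"),
    ),
)


def targets(delta, entity):
    """Properties of `entity` that this delta makes an assertion about."""
    return [b.prop for b in delta.bindings if b.entity == entity and b.prop]
```
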
-- q: each delta implies a context?

suggested path: write a set of assertions
those serve as the specification

initial rollout possibility
toy model that may or may not grow
each participant gets their own data store
should be able to choose to share certain deltas with certain people?

-- q:

store
retrieve
send
receive
compute

a concept we're pointing to: you could invoke a function by inserting some deltas into the graph

e.g. a "publish" function that notifies another user
can share with the other user so they can activate that behavior
Smalltalk-type messaging structure on top of the database
note dimensions of attack surface

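A small sketch of "invoke a function by inserting a delta", under the assumption that a delta can carry a `call` field naming a registered handler. The `Store` class, the `call`/`to`/`body` fields, and the in-memory `inbox` are all hypothetical.

```python
class Store:
    def __init__(self):
        self.log = []       # append-only delta log
        self.handlers = {}  # function name -> callable

    def register(self, name, fn):
        self.handlers[name] = fn

    def insert(self, delta):
        self.log.append(delta)
        # A delta whose "call" field names a registered function invokes it.
        fn = self.handlers.get(delta.get("call"))
        if fn is not None:
            fn(delta)


inbox = []  # stands in for notifying another user

store = Store()
store.register("publish", lambda d: inbox.append((d["to"], d["body"])))

store.insert({"id": "d1", "call": "publish", "to": "bob", "body": "new design posted"})
store.insert({"id": "d2", "entity": "user:1", "prop": "size", "value": "L"})
```

The attack-surface note applies directly: in this shape, anyone who can insert a delta can trigger whatever handlers are registered.
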
layers:
primitives - what's a delta, schema, materialized view, lossy view
delta store - allows you to persist deltas and query over the delta stream
materialized view store - lossy snapshot(s)
lossy bindings - e.g. GraphQL, define what a user looks like, that gets application bindings
e.g. where your resolvers come in, so that your fields aren't all arrays, i.e. you resolve conflicts

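A sketch of the resolver point, assuming dict-shaped deltas with a `ts` timestamp: without resolvers every field is an array of competing assertions, and a per-field resolver collapses that array into the value an application sees. `collect`, `last_write_wins`, `all_values`, and `resolve` are hypothetical names, and last-write-wins is just one possible policy.

```python
def collect(deltas):
    """Unresolved view: every property maps to all deltas asserting it."""
    view = {}
    for d in deltas:
        view.setdefault(d["prop"], []).append(d)
    return view


def last_write_wins(candidates):
    # Keep the value of the most recent assertion; older ones are dropped.
    return max(candidates, key=lambda d: d["ts"])["value"]


def all_values(candidates):
    # No conflict to resolve: keep every asserted value, oldest first.
    return [d["value"] for d in sorted(candidates, key=lambda d: d["ts"])]


def resolve(view, resolvers):
    """Apply a per-field resolver; fields without one stay as arrays."""
    return {
        prop: resolvers.get(prop, all_values)(candidates)
        for prop, candidates in view.items()
    }


deltas = [
    {"id": "d1", "ts": 1, "prop": "display_name", "value": "old name"},
    {"id": "d2", "ts": 2, "prop": "display_name", "value": "new name"},
    {"id": "d3", "ts": 3, "prop": "emails", "value": "a@example.com"},
    {"id": "d4", "ts": 4, "prop": "emails", "value": "b@example.com"},
]

user = resolve(collect(deltas), {"display_name": last_write_wins})
```
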
-- idea: diff tools
comparing, merging suggestions

-- idea: operations encoded as deltas, that agents can execute

tangent: absential properties
things that are absent from a model can have significance
causal absence -- something that is absent but which caused a thing to be the way it is

schema
we'd keep a bucket of myk as a user, that combinatorially combines associated schemas

every network that you have access to is within your query space

so... some data types could be more collaborative

adding a field to a delta doesn't add it to a materialized view automatically -- but that's a good thing
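A sketch of why that can be a good thing, assuming the view is built from an explicit list of schema fields: a new field in a delta stays invisible to existing views until a schema opts in, so existing consumers see no change. The `fields` key and the `materialize` signature are assumptions.

```python
def materialize(deltas, schema_fields):
    """Build a view containing only the fields the schema asks for."""
    view = {}
    for d in deltas:
        for field, value in d["fields"].items():
            if field in schema_fields:
                view[field] = value  # later deltas overwrite earlier ones
    return view


deltas = [
    {"id": "d1", "fields": {"size": "L"}},
    {"id": "d2", "fields": {"size": "XL", "fit": "slim"}},  # "fit" is a new field
]

old_view = materialize(deltas, {"size"})
new_view = materialize(deltas, {"size", "fit"})
```

The delta log is identical in both cases; only the schema used to read it differs.
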