A Powerful Blockchain-like Approach to Data & Software Behaviors
The target audiences for this blog post are software engineers, data scientists and business leaders interested in a powerful new approach to computing. Essence® addresses foundational issues in computing that affect energy efficiency, processing speed, security, scalability, and more.
In a previous blog post (read it here), we introduced the Essence® Project. Essence® makes ‘Machine Behaviors’ flexible commodities that can be traded, modified, and traded again in a marketplace or across organizations and use cases. We enable software behaviors to be commoditized by creating data structures that can translate themselves into others, sometimes perfectly and sometimes requiring human assistance (increasingly via dialogue) to resolve ambiguities or a lack of shared examples.
In our blog post about a Superset approach to AI/ML, we describe how Essence® combines multiple algorithms and executes them in parallel. This opens up new possibilities for distributed ledger-based systems that are complementary to blockchain, and for computing in general.
What Problem Do Translatable Data Structures Solve?
As the amount of data being produced by billions of sensors, services and users around the world increases exponentially, rapidly making sense of the data and ensuring trust are obvious imperatives. The same is true for machine behaviors. What is the device doing? Why is it doing it? What should it do vs. what is it being asked to do? How do we get it to do something? How do we generate the optimal machine instructions? Does the requestor [user or code] even have a right to know that the objects involved exist? These are fundamental questions that require a new level of understanding and transparency in computing. This is especially true in the new era of AI technology.
Essence® is a way to establish meanings and ensure trusted behaviors of both data and machines, no matter the type of data or device. With Essence® as a core component, systems based on or integrated with it, both legacy and new software (e.g. apps, operating systems, browsers, drivers, APIs, emulators, virtual machines, media containers, network devices, chipsets, etc.), all obtain the benefits of an advanced blockchain design.
Key Features of Translatable Data Structures
“Translatable Data Structures” let us reorder stored data and runtime-processed data streams (such as lossy compaction of floats in the 0…1 range, or reordering an AoS (array of structures) into an SoA (structure of arrays), as in turning padded XYZw, XYZw, XYZw arrays into XXX, YYY, ZZZ, etc.). Here’s why this is such a powerful way to work with data and code:
- Reordering lets us map the same data groups to many different chunks of code. It also lets us profile and regenerate the actual implementation, its algorithmic units (conventional or AI), and the final machine instructions on the fly.
- Profiling and regenerating allow us to self-optimize for the computer’s conditions, such as heavy hard drive traffic from another process or low-battery mode toggling on and off. This Essence® design of “Translatable Data Structures, Meaning Implementations, and Code Regeneration” enables Essence® to mix and match code behavior from unrelated domains or even forks of the same project.
- Our design requires Objects to have two patterns. Every chunk of data has a Guard, which manages access/trust/security, and a means to Synchronize and Coalesce changes, so that the correct values are read at the correct time. The Guard validates that even the ability to know that a chunk of data exists is legitimate.
- Internally we refer to uniquely named entities/objects as a ‘Thing’. We use Elixir®, a meaning representation system (not a language), to build meaning units composed of elements. A Thing is represented by element #97, whose mnemonic is ‘Dz’. A Dz is found via a unique id (like a context-routed URL) and a ‘time of access’. The Dz element is defined as: Thing, a Named Entity, Existence, Presence, package of Possibilities in a Spacetime existence. Presence describes a container for a Thing, an Existence, whether concrete or abstract. Presence means that its name can be referenced as an Existence. Any Idea that exists can be referenced in an Aptiv (a package containing binaries, data, and instructions for code generation, much like an App).
- The Guard is the part of a Dz that has a single representation which determines if any value associated with Dz can be read, written, or even loaded/decompressed/decrypted.
- The Guard uses traditional prime-factor cryptographic keys with an ever-growing suite of algorithms to choose from, which can be combined and implemented in parallel.
- The Guard does not use ‘time of access’ unless it has been altered. Via procedural programming logic (regular Elixir®), the Guard supports many different ‘users’ with varying access types and conditions.
- The Guard generally has little runtime overhead, as it’s part of scheduling a ‘task’ or fulfilling a request inside a task.
- Unlike traditional Object protection, such as public/private C++ declarations or C89 symbolic-name and scope obscurity, the Guard is designed to be changed in realtime, to support multiple contexts, and to avoid the race conditions, stalls, and slowdowns of traditional Actor message-passing queues.
- Since Essence® internally operates as a distributed job system, all access/changes/synchronizations of values are handled via scheduling instead of with locks or traditional Actor messages.
- Our Dz model first resolves the ‘who & when’ of a request to load/read/write, and then injects an authorized task into the nearest/most-relevant job pool.
- The second piece of Dz is ‘Synchronization‘ that handles returning the correct value for the time accessed or making allowed changes.
- ‘Sync‘ allows us to work with a Clone of Dz values in whatever job we’re completing, such as a map/reduce query, graphical render, or conversion.
- ‘Clones’ contain only the Dz data needed, such as an object’s speed, age, and GPS coordinates, and are translated (copy-converted) to a processor’s local memory when possible, such as PTX VRAM for an Nvidia GPU, an 80x86 cache line for a CPU socket, etc.
- ‘Sync‘ handles coalescing writes, such as a series of location changes.
- Dz are structured as a hash-indexed, compressed tree of ‘Ideas’, which are our ‘flexible data structures’. Each hash-indexed ‘Idea’ that a Dz has follows the usual ‘has-a’ model of named properties.
- Dz properties such as ‘physical-presence’, ‘bank-account’, ‘image-logo’, or ‘music-preferences’, have a memory address, generally in the same memory page, where we’ll find a compact b-tree of access-times and unique-data-locations.
- Each node in that tree represents a time-recorded change, or a compressed spline of progressive changes, or deltas-only (depending on which is smallest for the structure and limits on too many deltas, etc.).
- Nodes can be unique to that Dz, shared by all Dz in a small group, or inherited. This allows us to create large numbers of Dz entities, such as trees in a computer-generated forest, with only a few bytes per Dz initially.
- Then, as unique modifications or group modifications are made, those individual changes are saved but no other data is kept. This allows us to track which changes were made, by whom, and when, for debugging/understanding as well as for rewinding and fast-forwarding many different variations or ‘Dz forks’.
- Note that our ‘access-time’ values are 2D, meaning we have an actual ‘time’ value and a ‘variation’ or fork value for cases where the same Dz exists in different variations simultaneously. This is required to have ‘things thinking about things and being able to modify them, see how they react, etc.’
- There might be 1,000 robots with unique physical locations, shared logos per group of 30, and inherited musical-preferences. Our goal is to support distributed simulations and computation in a scenario where processors might fail and tasks might be migrated to other processors in real-time.
- Our ‘Object‘ model, for ‘Dz‘ things, separates access from synchronization-history.
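The synchronization half of the list above (time-recorded changes, 2D access-times with a fork value, and coalesced writes) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Essence® code: the names `PropertyHistory`, `write`, `read_at`, and `coalesce` are hypothetical, a sorted list stands in for the compact b-tree, and the rule that a fork falls back to variant 0 when it has no value of its own is an assumption. The Guard step is left out entirely.

```python
import bisect
from collections import defaultdict


class PropertyHistory:
    """Time-recorded changes for one Dz property, keyed by (variant, time)."""

    def __init__(self):
        # variant id -> parallel lists of times (kept sorted) and values
        self._times = defaultdict(list)
        self._values = defaultdict(list)

    def write(self, time, value, variant=0):
        """Record a change; a second write at the same time replaces the first."""
        times, values = self._times[variant], self._values[variant]
        i = bisect.bisect_left(times, time)
        if i < len(times) and times[i] == time:
            values[i] = value
        else:
            times.insert(i, time)
            values.insert(i, value)

    def read_at(self, time, variant=0):
        """Return the value in effect at `time`.

        Assumed rule: a fork with no value yet falls back to variant 0.
        """
        for v in (variant, 0):
            times = self._times.get(v, [])
            i = bisect.bisect_right(times, time)
            if i > 0:
                return self._values[v][i - 1]
        return None

    def coalesce(self, start, end, variant=0):
        """Collapse all changes in [start, end] into the most recent one."""
        times, values = self._times[variant], self._values[variant]
        lo = bisect.bisect_left(times, start)
        hi = bisect.bisect_right(times, end)
        if hi - lo > 1:
            last_time, last_value = times[hi - 1], values[hi - 1]
            times[lo:hi] = [last_time]
            values[lo:hi] = [last_value]


location = PropertyHistory()
location.write(1, (0.0, 0.0))
location.write(2, (0.5, 0.0))
location.write(3, (1.0, 0.0))
location.write(3, (9.0, 9.0), variant=1)  # a fork diverges at time 3

print(location.read_at(2))             # (0.5, 0.0)
print(location.read_at(3, variant=1))  # (9.0, 9.0)
print(location.read_at(2, variant=1))  # (0.5, 0.0), via the fallback
location.coalesce(1, 3)                # keep only the last location change
print(location.read_at(3))             # (1.0, 0.0)
```

Keeping each variant in its own sorted run means a rewind or fast-forward is a binary search, and a fork costs nothing until it records its first change.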
Our Sensory Pipeline
In realtime Simulations and Video-Games as well as offline Movie-CGI and CAD rendering, there are a variety of graphical primitives used, such as triangle meshes, voxel surflets, CSG shapes via SDF fields, Point Clouds as Splats, polynomial surfaces, etc.
How an object is stored, edited, animated, or modified by physics is often different than how it is rendered, with conversions between graphical-primitives being common to handle collision detection differently than global illumination.
In Essence®, we handle translations between different primitives natively and with worst-case coverage to avoid seams, geometric-holes, surface inversions, and other boundary-representation conflicts.
Our primary goal is the ability to provide levels of detail from a single lit point to a highly detailed, procedurally-enhanced model.
The formula generation process keeps all data types as ‘potential fields’, which group data by level of detail. Perhaps imagine a wavelet or mipmap hierarchy for images as ideas, but replacing the uniform structure with discontinuous blobs and directions. Extend that to 3D or 4D for real-world, moving object models and we have a system that can scale from a fingertip to a valley to planets to galaxies and back down again.
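For readers unfamiliar with the mipmap analogy, here is a minimal Python sketch of a uniform level-of-detail hierarchy in one dimension. It shows only the conventional, uniform structure that the paragraph above says Essence® replaces with discontinuous blobs and directions. The function names `build_pyramid` and `pick_level` are hypothetical, and the input length is assumed to be a power of two.

```python
def build_pyramid(samples):
    """Return levels of detail from full resolution down to a single value.

    Each coarser level averages adjacent pairs, as in a 1D mipmap chain.
    Assumes len(samples) is a power of two.
    """
    levels = [list(samples)]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        coarse = [(fine[i] + fine[i + 1]) / 2 for i in range(0, len(fine), 2)]
        levels.append(coarse)
    return levels


def pick_level(levels, max_samples):
    """Choose the most detailed level that fits within a sample budget."""
    for level in levels:
        if len(level) <= max_samples:
            return level
    return levels[-1]


levels = build_pyramid([1, 3, 5, 7])
print(levels)                 # [[1, 3, 5, 7], [2.0, 6.0], [4.0]]
print(pick_level(levels, 2))  # [2.0, 6.0]
```

The coarsest level is the "single lit point" end of the range; a renderer with more budget simply picks a finer level of the same data.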
The most critical design decision was not the choice of a specific rendering algorithm or 3D model format, but how to scale scenes from a few highly detailed objects to a scale where those objects are no longer directly visible but could still contribute to the larger visual. We decided early on that scaling was the singular problem that prevents ‘run-anywhere’ experiences, whether graphics are limited by hardware power, available memory/persistent storage, individual model complexity, scene complexity, or observer range. Granted, there are many scenarios that will display poorly or obscure desired details, such as a vast crowd of unique faces being scaled down to a few upsampled pixels, but the scene will either still show and remain interactive, or take a long time but show at high quality.
So graphical Objects can be imported and exported as many traditional primitives and can be rendered with many styles, scene techniques, and future methods (using sequences of Elixir® or Ditto-Powers (native code)). We currently manipulate and run 2D images of color, intensity, depth, and distance fields (pictures/movies/depth-cameras) and 3D triangle meshes, voxels, point clouds, and SDFs (conjured scenes).
Representations of Relationships Between Objects:
Here we’ll describe how our system for expressing and storing meanings (Elixir®) and Data Storage approach (Nebulo®) are used to form relationships.
Internally, Elixir® is defined with the traditional Is-a (identity-type), Has-a (possession), As-a (specific translation) relationships as well as Logical/Fuzzy/Math operators to investigate.
Data storage and access is separate from the data structure of the Dz itself for a variety of reasons. We have 8 types of ‘possibility boxes’ based on how we can scale data (lossy-signal, exact-record, rolling-queue history, belief-certainty thoughts, etc.).
All information for each Dz is stored in these boxes, which are organized as a container-of-containers with a fixed depth and size constraints.
Relationships between objects can be traditional tuples, as in logic programming languages like Prolog, such as ‘Abraham, Isaac, father-son’, or more complicated sets, where instead of creating a singular yes/no Essence Idea for son, you have a more complicated data structure with more data fields, such as ‘genetic relation’.
In all cases however, the important aspect is the ability to allow ‘son’ and ‘genetic relation’ and other versions of ‘child’ to be translated between each other. This design choice has been central to all we’ve built to encourage many users to reuse other people’s ideas or create their own unique ideas but potentially reuse other behaviors or in this example ‘inferences’.
“As-a” or ‘is-like’ ends up more important than ‘is-a’ relationships. You might think of this as multiple, simultaneous ontologies allowing ‘everyone to create/instruct and reuse/purchase behaviors/data from others’.
The aforementioned staple of relationships, the ‘tuple’, is usually stored as 3 unique ids in a ‘belief-type-box’. This can be queried using the usual SQL/Lisp-style approaches.
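A rough Python sketch of the tuple idea: three ids per row, wildcard queries, and a lookup that also accepts relations translated ‘as-a’ another. Everything here is illustrative rather than the Nebulo® format. The class `BeliefBox`, its methods, the numeric ids, and the `as_a` mapping are all hypothetical.

```python
from typing import NamedTuple


class Belief(NamedTuple):
    subject: int
    obj: int
    relation: int


class BeliefBox:
    """Stores relationships as rows of three unique ids."""

    def __init__(self):
        self._rows = []

    def add(self, subject, obj, relation):
        self._rows.append(Belief(subject, obj, relation))

    def query(self, subject=None, obj=None, relation=None):
        """Return rows matching every field that is not None (None = wildcard)."""
        return [
            r for r in self._rows
            if (subject is None or r.subject == subject)
            and (obj is None or r.obj == obj)
            and (relation is None or r.relation == relation)
        ]

    def query_as(self, relation, as_a):
        """Match `relation` plus any relation ids that `as_a` maps onto it."""
        accepted = {relation} | {src for src, dst in as_a.items() if dst == relation}
        return [r for r in self._rows if r.relation in accepted]


# Illustrative ids only.
ABRAHAM, ISAAC, JACOB = 1, 2, 3
FATHER_SON, GENETIC_RELATION = 100, 101

box = BeliefBox()
box.add(ABRAHAM, ISAAC, FATHER_SON)
box.add(ISAAC, JACOB, GENETIC_RELATION)

# 'father-son' can be read as-a 'genetic relation'.
as_a = {FATHER_SON: GENETIC_RELATION}

print(box.query(relation=FATHER_SON))           # one row: Abraham, Isaac
print(box.query_as(GENETIC_RELATION, as_a))     # both rows
```

The `as_a` table is the part that matters for reuse: a query written against one person's ‘genetic relation’ idea still finds rows recorded with someone else's ‘father-son’ idea.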
The other part of relationships that we support is ‘awareness’ of change. Inside each ‘Idea’ we allow any has-a property, such as age, weight, volume, density, etc., to be stored as a value, a behavior, or both, as when ‘changing density’ needs to update weight.
Relationships can be built, as a formula graph of logic/math/processing operators (Elixir®), to express interdependencies inside of an ‘Idea’.
Relationships can also be represented between instantiated Dz, so if a Screen Dz switches to an alert, it can notify its Movie-Player Dz to pause. That notification system is designed to be compact and efficient as a hash tree of observer Dz, the ‘information-unit’ that changed, and the context (access/time-variant). As described earlier, this is passed to the Dz’s Guard and then becomes a job.
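The Screen and Movie-Player example above might look roughly like this in Python. This is a sketch under assumptions: a plain dictionary stands in for the hash tree of observers, the Guard check is omitted, and the names `ChangeBus`, `observe`, `changed`, and `run_jobs` are hypothetical. The one point it tries to preserve from the text is that a change becomes a queued job instead of a direct call.

```python
from collections import defaultdict


class ChangeBus:
    """Maps (observed Dz id, information-unit) to the observers to notify."""

    def __init__(self):
        self._observers = defaultdict(list)
        self.jobs = []  # stand-in for a job pool

    def observe(self, dz_id, unit, observer_id, action):
        self._observers[(dz_id, unit)].append((observer_id, action))

    def changed(self, dz_id, unit, value, context):
        """Queue one job per observer rather than calling it directly."""
        for observer_id, action in self._observers.get((dz_id, unit), []):
            self.jobs.append((observer_id, action, value, context))

    def run_jobs(self):
        """Run and clear the queued jobs, returning their results in order."""
        results = [action(value, context) for _, action, value, context in self.jobs]
        self.jobs.clear()
        return results


bus = ChangeBus()
bus.observe(
    "screen", "mode", "movie-player",
    lambda value, context: "pause" if value == "alert" else "continue",
)

# The context carries the access time and variant, as described above.
bus.changed("screen", "mode", "alert", context={"time": 10, "variant": 0})
print(bus.run_jobs())  # ['pause']
```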
- Internally everything is stored as a binary-stream beginning and ending with Elixir® to annotate what ‘Idea’ and format it is in. This allows us to carry several ‘Clones’ of the same data in memory pages (Readonly/Readwrite both supported).
- To boot the system and generate the Essence® startup files, we have a “Generate_Origin” program of C code that fills in all the required ideas/data into a file used to run an Essence® package (as an App, an OS, Emulator, VM, etc.).
- To express data as a user, we have a method to associate Natural Language terms (spoken via speech-recognition or typed) with known “verb, subject, object, modifier” clauses, which we call a ‘Grok-Unit®‘. We handle misspellings, reduce/remap synonyms, map words to Dz and their Idea properties to find ‘valid’ combinations and weight them based on proximity of match, recent history, and other factors to sort most-likely matches first.
- A user might type “show me a cloud, take a picture, add the words ‘have a nice day’, email it to my wife” and it will match all “Grok-Units” that fit each clause. This works as a dialog with the user, in the vein of ‘here’s what I understand from your words’. The choices presented are always valid code. There is no possibility of syntax errors, of referencing invalid data, of misused/mismatched types, etc. The user can select an interpretation and can only choose other ‘valid’ choices. It’s interactive but keeps the user experience focused on their words, their expression of what they need.
- Under the hood, in Elixir®, we have a meaning model of computation and data representation that covers the traditional procedural programming, so most programming paradigms are handled with a known and reproducible behavior such as if-then, loop, match-a-known-set, collection add/remove/find/sort-by, create/destroy.
- We have ‘Synergy®‘ to express rules/data as words and ‘Maven®‘ to edit them as a visual graph. Order of events is visually segmented as the ‘Elixir®‘ generated is meant to be run as parallel as is possible/practical and any dependencies are shown (change A, change B, *C is updated to be the average of A and B*, or “if A is an Imaginary Number, then C is updated to be an imaginary number too”, etc.).
- This really helps for investigating behaviors and understanding ‘why did the computer do X’. Side effects are the trickiest aspect of traditional coding, so we’ve made them an explicit, up-front element in how we operate. Because we have to generate the machine instructions from the meaning-intent, we are able to determine dependency chains and side effects upfront.
- The unit of operation inside Essence® is ‘work’ which is defined as a ‘Job’ (an array of Elixir® ‘code’), a ‘Context‘ (data specific to that job/access/variant) along with a ‘Situation‘ which handles a virtual machine style model of ‘Mind’ which has a series of Topics (think of this like coding Stack Frames but ordered like graph nodes so we can switch between them).
- Each Topic has ‘thought nodes‘ that contain Situations’ subject, object, verb, modifiers, etc.
- Code is generated from these by choosing an Algorithm variant, such as sort, based on matches for the ‘verb’ (as an operator, like multiply-accumulate, find-least, if (X), etc.). The ‘thought nodes’ are simple Dz-Properties that refer to a piece of data (such as a single byte, a GiB buffer, or a sparse mapping of tuples).
- On actual computation (load/read/write/sync) that Dz-has-a property is accessed. Facts would be generally modeled as a query, returning a yes/no, or certainty-percentage, or list of elements, and are done with map/reduce operators in Elixir®.
- Essence® aims to enable users to create, modify, and market-place trade behaviors and data. It’s about making code behavior a commodity the way that data (i.e. photos, videos, text, etc.) is.
- That requires that any given ‘Idea’ can attempt to translate itself to another…which might go poorly without human feedback or in cases where there are no shared examples. However, this makes it possible, in real-time, unlike how it’d be done in existing coding methods with engineering talent and refactoring.
- Objects, as ‘Dz‘, can be shared with all their ‘Idea’s as translatable data structures, and mapped to various ‘new worlds/open worlds’ as best they can fit. Would love to discuss further or go through existing examples.
- To detect unintended behavior, whether malicious intent (Trust issue) or malfunction (Error issue), we use Elixir® just as we do for all behaviors.
- Essence® is aware of all requests to load/read/write/etc. any Dz’s property and of who issued the request. To resist DDoS-style flooding, a filtration mechanism caps the requests made to a Guard and works as a hierarchy.
- All values have a valid range, which might be simple high/low number caps, an arbitrary vector space (series of planes), or a more complicated “On Change” behavior that verifies only acceptable values are written.
- It’s really up to any interested user to decide what constitutes mistrust. The big advantage that we have is the ability to define trust with far greater control than other approaches offer.
- We have about 30 behaviors now that check for ill intent, such as any attempt to write a value that ‘falls outside the range’, or attempts to modify a property without using the correct Dz Idea to do so. For example, “Bank Account Balance” has a behavior that only modifies the number if the requesting Dz can present the ‘Bank Deposit Idea’; otherwise it will mark the sender’s Dz as ‘bad’ for the entire Domain (collection of Dz). In this topic we have the opportunity to detect mistakes/bugs or attempts to violate agreements (malicious intent to circumvent Access or Schedule or Instructions/Validity).
- Who has Access to Data…whether that’s a single byte value, a composite data structure (Contact Record) or a massive 20TiB database. This is our Guardian approach, which determines if the access-requesting ‘Thing’ can a) even make a request, b) know a name…just have the name or not, c) read the name’s data, d) clone that data (different than reading in Essence…a copy is kept separate from ‘read to make a decision without storing data’), e) modify, f) remove (purge from the history of changes forever).
- When Access to Data is permitted. Since generating code to match available compute resources (which processor, which code, which location in ram/drive/ports) is based on scheduling, this is a critical part of maintaining the order of changes made, such as a bank account deposit and withdrawal.
- Which Instructions were issued on Data. Here is where we can check if an intent is not allowed, whether due to access (who or when), availability (data is present), correctness (change to data would be out of range), unimaginable (can’t formulate request, such as divide by zero), or validity (other conditions apply to this).
- Given that the unique ID of the sender is incorporated into any Encryption algorithm that has its “trust-me” certification (big prime combo and offered prime factor) as well as the Domain (the Essence ID), we have an upper bound on attempts to spoof and an easy way to turn off attacks to modify a Dz’s property if the Guard access keys have been compromised.
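The ‘who’ and valid-range checks in the list above can be sketched in a few lines of Python. This is an illustration under assumptions, not the Guard's actual design: the six access kinds follow items a) through f), but modeling them as independent flags, using simple high/low caps as the valid range, and marking a sender ‘bad’ on the first out-of-range write are all assumptions. The names `Access`, `Guard`, `grant`, `allows`, and `check_write` are hypothetical, and cryptographic keys and scheduling are left out.

```python
from enum import Flag, auto


class Access(Flag):
    NONE = 0
    REQUEST = auto()    # a) may make a request at all
    KNOW_NAME = auto()  # b) may know the name exists
    READ = auto()       # c) may read the data
    CLONE = auto()      # d) may keep a copy
    MODIFY = auto()     # e) may modify
    REMOVE = auto()     # f) may purge from history


class Guard:
    def __init__(self, low, high):
        self.low, self.high = low, high  # simple high/low valid range
        self._grants = {}                # sender id -> Access flags
        self.bad_senders = set()

    def grant(self, sender, access):
        self._grants[sender] = self._grants.get(sender, Access.NONE) | access

    def allows(self, sender, needed):
        """True only if the sender holds every flag in `needed`."""
        if sender in self.bad_senders:
            return False
        granted = self._grants.get(sender, Access.NONE)
        return (granted & needed) == needed

    def check_write(self, sender, value):
        """Return True if the write may be scheduled.

        Assumed policy: an authorized sender that attempts an
        out-of-range write is marked 'bad' and refused from then on.
        """
        if not self.allows(sender, Access.REQUEST | Access.MODIFY):
            return False
        if not (self.low <= value <= self.high):
            self.bad_senders.add(sender)
            return False
        return True


balance_guard = Guard(low=0, high=1_000)
balance_guard.grant("teller", Access.REQUEST | Access.READ | Access.MODIFY)
balance_guard.grant("viewer", Access.REQUEST | Access.READ)

print(balance_guard.check_write("viewer", 10))     # False: no MODIFY
print(balance_guard.check_write("teller", 10))     # True
print(balance_guard.check_write("teller", 5_000))  # False: out of range
print(balance_guard.check_write("teller", 10))     # False: sender now marked bad
```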
Prevention from Malicious Hijacking
- There are no guarantees that all exploits are preventable, given the ability to freeze the processor(s) and modify generated code (if using native processor instructions) or emulator generated code (if using a JIT/interpreter for a processor).
- However, given that most code is regenerated on the fly with hardcoded memory-page or register addresses, offsets, and tons of literal/constant values, it should be sufficiently difficult to steal relevant IP or conduct nefarious activities without compromising the system.
- Essence-based executables start up with several intentionally malevolent jobs in the system. They add negligible processing overhead, but if the Guard system is ever compromised, these ‘virtual Cancers’ would immediately perform their own malevolent tasks (perhaps corrupting or erasing core data, transmitting an alert if possible, etc.).
- We don’t believe that any system will be foolproof due to human nature, issues of sharing trust, and tricks in computing.
- However, we do have the ability to add more tools for users to detect and respond to unintended behavior, whether malevolent or mechanical/OS-failures.
Maintenance and Debugging Generated Code (Qcode)
Essence® was designed for multiple levels of understanding what is happening.
- All data and behavior can be shown as various Natural Language mappings, if available, or a crude but readable approximation by directly mapping the Elixir® (Meaning Units) to a language using a synonym/meaning-use-cases dictionary.
- Given the nature of Elixir®, we are capable of freezing, rewinding and replaying behaviors in a context (local data) / situation (who (subject/object), what, where, when, how (verb/modifier)).
- This gives us an interactive debugger, with full programming-language-style “Reflection” for the data used, live inspection, and an easy way to add ‘notify me when X changes’.
- That being said, we have more work to do, in Elixir®, to finish the desired first pass at a ‘debugger’ or, in this case, a ‘parallel investigator’ that can operate along with an actual machine-code debugger. Note that Elixir® can call external services, such as decompress MPEG, post this text to Social-Media-Feed, check VR Controller, etc. These services are the hardware, format, and network services that ‘Power’ Essence®. We store them in the Ditto Format and have a different strategy for debugging them. We currently run LLDB and GDB as debuggers within Essence® when building our Ditto ‘Powers’ so we can debug these ‘drivers or service providers’ in native code land. We run these command line debuggers inside the world as a text screen, like a command line app, but could readily pull values and update a broader visualization when efforts warrant it. We are able to edit text, output it as a natively compilable file, run tools on it, such as Clang/LLVM or SPIR-V, link it, and load it as a dynamic library (shared object). We already incorporate other command line tools such as AddressSanitizer and Valgrind on said Ditto code.
We invite others who are interested in learning more about Essence® to contact us. Follow us on this blog, social media, and our website as we expose truly world-changing solutions. More about solutions created using Essence® in future posts.
Get ready to ‘create at the speed of thought®‘
Ken Granville & Jake Kolb
Cofounders of MindAptiv®