Dimensional Directory
A flexible and extensible unit-based data management system that redefines how information is stored, organized, and retrieved.
System Overview
The Dimensional Directory (DD) is a core component of FractalWaves' computational infrastructure. Rather than treating information as rows in a conventional database, it treats information units as geometric entities with precise coordinates in an information space. At its core, the system uses a table-based index that maps zero-indexed addresses to UUIDs, and each UUID functions as a pointer to the exact location of the data in HDF5 storage.
Core Innovation
The DD combines zero-indexed addressing with UUID-based deduplication, so every unit keeps a human-readable location while identical content is stored only once. This duality mirrors the fundamental principles of C-Space theory.
Practical Impact
This architecture makes it efficient to store, retrieve, and relate information units across scales, from tokens to documents to entire knowledge domains, while preserving referential integrity.
Key Features
UUID-Based Deduplication with Zero-Indexed Access
Identical content is stored once in HDF5 using UUIDs, but can be referenced through human-readable zero-indexed addresses via a table-based indexing system.
Table-Based Indexing System
Maps between zero-indexed addresses and UUIDs to efficiently locate data in HDF5 storage, providing a layer of indirection between user-facing addresses and physical storage.
Dual-Database Architecture
Combines SQLite for structured metadata with HDF5 for raw data and embeddings, optimizing for both relational integrity and high-dimensional vector operations.
Extensible Type System
Type-specific managers provide specialized operations for different content types, enabling domain-specific optimization within a unified framework.
Relation Management
Create and manage typed relationships between content units, enabling complex knowledge graphs with metadata-rich connections.
Embedding Support
First-class support for vector embeddings enables semantic operations across the information space, bridging symbolic and geometric representations.
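The metadata and relation features above can be pictured with a small SQLite sketch. This is only an illustration under stated assumptions: the table names, column names, and the helper functions `open_metadata_db`, `add_relation`, and `related` are hypothetical and are not taken from the DD codebase, and the HDF5 side of the dual-database design is not modelled at all.

```python
import json
import sqlite3

# Hypothetical schema: none of these table or column names come from the DD itself.
SCHEMA = """
CREATE TABLE units (
    uuid      TEXT PRIMARY KEY,   -- pointer to the payload in HDF5 (not modelled here)
    unit_type TEXT NOT NULL
);
CREATE TABLE addresses (
    address TEXT PRIMARY KEY,     -- human-readable zero-indexed address
    uuid    TEXT NOT NULL REFERENCES units(uuid)
);
CREATE TABLE relations (
    source_uuid   TEXT NOT NULL REFERENCES units(uuid),
    target_uuid   TEXT NOT NULL REFERENCES units(uuid),
    relation_type TEXT NOT NULL,
    metadata      TEXT NOT NULL DEFAULT '{}',  -- JSON-encoded attributes
    PRIMARY KEY (source_uuid, target_uuid, relation_type)
);
"""


def open_metadata_db(path=":memory:"):
    """Open a SQLite database and create the assumed metadata schema."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def add_relation(conn, source, target, relation_type, metadata=None):
    """Store a typed, metadata-carrying edge between two existing units."""
    conn.execute(
        "INSERT INTO relations VALUES (?, ?, ?, ?)",
        (source, target, relation_type, json.dumps(metadata or {})),
    )


def related(conn, source, relation_type):
    """Return the UUIDs reachable from `source` via edges of the given type."""
    rows = conn.execute(
        "SELECT target_uuid FROM relations "
        "WHERE source_uuid = ? AND relation_type = ? ORDER BY target_uuid",
        (source, relation_type),
    )
    return [row[0] for row in rows]
```

Keeping relations keyed by UUID rather than by address means an edge survives even if several addresses point at the same deduplicated content.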
System Architecture
The Dimensional Directory architecture integrates multiple specialized components that work together to provide a unified data management experience across different information types and scales.
Zero-Indexed Addressing and UUID Mapping
The DD uses a hierarchical addressing scheme with a table-based mapping system that serves as a layer of indirection between human-readable addresses and the actual data storage:
Address Format & Mapping
doc:123
↪ Maps to a UUID in the indexing table
doc:123-0
↪ Maps to a specific UUID in HDF5
doc:123-0.0
↪ Maps to a specific UUID in HDF5
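The three address forms above suggest a hierarchy of context, unit, and sub-unit. The sketch below parses such strings under an assumed grammar, `<type>:<id>[-<unit>[.<subunit>]]`, inferred only from these examples. The `Address` class, its field names, and `parse_address` are hypothetical and not part of the DD API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed grammar, inferred from the examples only: <type>:<id>[-<unit>[.<subunit>]]
_ADDRESS_RE = re.compile(
    r"^(?P<type>[a-z]+):(?P<id>\d+)(?:-(?P<unit>\d+)(?:\.(?P<sub>\d+))?)?$"
)


@dataclass(frozen=True)
class Address:
    unit_type: str
    context_id: int
    unit: Optional[int] = None
    subunit: Optional[int] = None

    @property
    def depth(self) -> int:
        """0 for a whole context, 1 for a unit, 2 for a sub-unit."""
        return sum(part is not None for part in (self.unit, self.subunit))


def parse_address(text: str) -> Address:
    """Split an address string into its hierarchical components."""
    match = _ADDRESS_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid address: {text!r}")
    unit = match.group("unit")
    sub = match.group("sub")
    return Address(
        unit_type=match.group("type"),
        context_id=int(match.group("id")),
        unit=int(unit) if unit is not None else None,
        subunit=int(sub) if sub is not None else None,
    )
```

For example, `parse_address("doc:123-0.0")` yields an `Address` with `depth` 2, while `parse_address("doc:123")` has `depth` 0.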
System Benefits
- Human-readable addressing for users
- Direct access to HDF5 data via UUID pointers
- Efficient storage through deduplication
- Fast retrieval with optimized indexing
How It Works
The zero-indexed addresses are mapped to UUIDs through an indexing table, and these UUIDs serve as pointers to the exact locations of data in the HDF5 database. This approach creates an abstraction layer over the physical storage, allowing for both human-friendly addressing and highly efficient data retrieval.
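The indirection described above can be illustrated with an in-memory stand-in. This is a sketch under assumptions, not the DD's implementation: the `IndexTable` class is hypothetical, a plain dict takes the place of HDF5, and deriving UUIDs from content with `uuid.uuid5` under a made-up namespace is one possible way to get deduplication, since the source does not say how the DD generates its UUIDs.

```python
import uuid

# Hypothetical namespace for content-derived UUIDs; the real DD may derive UUIDs differently.
CONTENT_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


class IndexTable:
    """In-memory stand-in: address -> UUID -> content (a dict replaces HDF5 here)."""

    def __init__(self):
        self._address_to_uuid = {}  # the indexing table
        self._store = {}            # stand-in for HDF5 storage, keyed by UUID

    def put(self, address, content):
        # uuid5 is deterministic, so identical content always yields the same UUID.
        content_uuid = uuid.uuid5(CONTENT_NAMESPACE, content)
        self._store.setdefault(content_uuid, content)  # store each payload once
        self._address_to_uuid[address] = content_uuid
        return content_uuid

    def uuid_for(self, address):
        """Resolve a human-readable address to its UUID."""
        return self._address_to_uuid[address]

    def get(self, address):
        """Resolve address -> UUID -> content in two lookups."""
        return self._store[self.uuid_for(address)]

    def stored_count(self):
        """Number of distinct payloads actually stored."""
        return len(self._store)
```

Two addresses that hold the same text resolve to the same UUID, so `stored_count()` grows only when new content arrives, not when new addresses do.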
Applications & Benefits
Practical Applications
- Knowledge Management: Unified storage for documents, data, and their interrelationships
- Data Integration: Seamless bridging between structured and unstructured information
- Content Generation: Foundation for coherent multi-scale content creation
- Semantic Search: Unified symbolic and vector-based retrieval of information
Key Benefits
- Performance: Direct access to HDF5 data locations through UUID pointers
- Scalability: Indexing system scales efficiently with growing data volumes
- Flexible Addressing: Human-readable navigation through complex information spaces
- Extensible Framework: Adaptable to domain-specific information needs
Implementation Example
Basic Usage
    from dimensional_directory.service.dd_service import EnhancedDDService

    # Initialize the service
    service = EnhancedDDService(base_path="./dd_data")

    # Process a document - returns UUID information
    result = service.process_input(
        data="This is a sample document. It contains multiple sentences.",
        input_type="document",
        context_id="doc-001"
    )

    # Get UUID for a zero-indexed address
    uuid = service.get_uuid_for_address("doc-001-0")
    print(uuid)  # "550e8400-e29b-41d4-a716-446655440000"

    # Get content using the zero-indexed address
    # (internally translates to UUID for HDF5 access)
    unit = service.get_unit("doc-001-0", resolve_type="address")
    print(unit['content'])  # "This is a sample document."

    # Get content using UUID directly
    unit = service.get_unit(uuid, resolve_type="uuid")
    print(unit['content'])  # "This is a sample document."
Advanced HDF5 Access
    import numpy as np

    # Get raw HDF5 data for a specific unit
    hdf5_data = service.get_hdf5_data("doc-001-0", data_type="raw")

    # Get structured HDF5 data
    structured_data = service.get_hdf5_data("doc-001", data_type="structured")

    # Get the stored embedding from HDF5
    stored_embedding = service.get_hdf5_data("doc-001-0", data_type="embedding")

    # Working with embeddings
    # Add an embedding for a unit (maps to UUID internally)
    embedding = np.random.rand(768).tolist()  # Example vector
    service.add_embedding("doc-001-0", embedding, unit_type="address")

    # Semantic search using embeddings
    similar_units = service.similarity_search(
        embedding,
        top_k=5,
        threshold=0.7
    )
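To make the `top_k` and `threshold` parameters of the search call above concrete, here is a minimal standalone sketch of a thresholded nearest-neighbour search. It assumes cosine similarity as the metric, which the source does not state, and uses plain Python lists instead of the service's storage. The functions below are illustrative and are not the `EnhancedDDService` implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarity_search(query, embeddings, top_k=5, threshold=0.7):
    """Return up to top_k (key, score) pairs with score >= threshold, best first.

    `embeddings` is a dict mapping a unit key (for example a UUID string)
    to its vector; this shape is an assumption made for the sketch.
    """
    scored = [(key, cosine_similarity(query, vec)) for key, vec in embeddings.items()]
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]
```

The threshold filters out weak matches before the top-k cut, so a query with few close neighbours returns fewer than `top_k` results rather than padding the list with unrelated units.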