The Sophia database and its architecture were born from research into, and reconsideration of, the primary algorithmic constraints of increasingly popular log-file-based data structures, such as LSM-tree, B-tree, etc.

Most log-based databases organize their file storage as a collection of sorted files that are periodically merged. Without a key-filtering scheme (such as a Bloom filter), finding a single key requires the database to traverse all files, which can take up to O(files_count * log(file_key_count)) in the worst case. Range scans are even worse, because a Bloom filter cannot take key order into account.
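
To make the cost concrete, here is a minimal sketch of the generic sorted-file scheme described above. It is not Sophia's code or API: the names `lookup`, `range_scan`, and the in-memory `files` list are hypothetical, each "file" is modelled as a pair of sorted Python lists, and keys are assumed to be unique across files to keep the merge simple.

```python
from bisect import bisect_left, bisect_right
from heapq import merge

def lookup(sorted_files, key):
    """Point lookup without any key filter.

    Each file costs one binary search, O(log(file_key_count)),
    and in the worst case every file is probed.
    Returns (value, files_probed).
    """
    probed = 0
    for keys, values in sorted_files:
        probed += 1
        i = bisect_left(keys, key)
        if i < len(keys) and keys[i] == key:
            return values[i], probed
    return None, probed

def range_scan(sorted_files, lo, hi):
    """Range scan over [lo, hi]: every file must be consulted,
    and the per-file runs are merged by key order."""
    runs = []
    for keys, values in sorted_files:
        start = bisect_left(keys, lo)
        end = bisect_right(keys, hi)
        runs.append(zip(keys[start:end], values[start:end]))
    return list(merge(*runs))

# Three hypothetical sorted files (keys unique across files).
files = [
    (["b", "k"], [1, 2]),
    (["a", "m", "z"], [3, 4, 5]),
    (["c", "d", "q"], [6, 7, 8]),
]

print(lookup(files, "q"))           # key lives in the last file probed
print(lookup(files, "x"))           # missing key: all files are probed
print(range_scan(files, "b", "m"))  # touches every file
```

A missing key, or a key stored in the last file, forces a binary search in every file. A per-file membership filter could skip files for the point lookup, but it offers no help to `range_scan`, which still has to visit every file.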

Sophia was designed to improve this situation by providing faster reads while still benefiting from an append-only design.

The following sections describe Sophia's design and its evolution from version 1.1 to 1.2 (the latest).

Dmitry Simonenko