hosh - Operable HaSH

License: GPL v3

This library provides a richer approach to hashing.
Here, hash digests can be combined through a reversible operation. In practice, this means that, for example, a sequence of data transformation steps can be identified by a single hash instead of a list of hashes. We call each such identifier a hosh (operable hash). A handful of colored pretty-printing methods is also provided to ease visual comparison between hashes.

Basic Usage

The library is centered on the class Hosh, which provides math operators and a few utility methods.

Multiplication

A product of hoshes generates a new hosh, as shown below, where byte sequences (b"···") are passed as arguments to stand in for the binary objects to be hashed. Here, multiplication is non-commutative (probabilistically speaking; see our paper).

Examples

from hosh.theme import HTML
from hosh import Hosh, ø, setup  # ø is a shortcut for identity (AltGr+O on most keyboards)

# Hoshes (operable hash-based elements) can be multiplied.
a = Hosh(content=b"Some large binary content...")
b = Hosh(content=b"Some other binary content. Might be, e.g., an action or another large content.")
c = a * b
print(f"{a} * {b} = {c}")

8CG9so9N1nQ59uNO8HGYcZ4ExQW5Haw4mErvw8m8 * 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = z3EgxfisgqbNXBd0eqDuFiaTblBLA5ZAUbvEZgOh


# Colored:
print(f"{a.html} * {b.html} = {c.html}")

8CG9so9N1nQ59uNO8HGYcZ4ExQW5Haw4mErvw8m8 * 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = z3EgxfisgqbNXBd0eqDuFiaTblBLA5ZAUbvEZgOh

# Global setup for light HTML (web) colors instead of dark ANSI (terminal).
setup(dark_theme=False, format=HTML)
Hosh.__str__ = Hosh.__repr__  # <- a little trick to allow print() in colors
print(~b)

Q6OjmYZSJ8pB3ogBVMKBOxVp-oZ80czvtUrSyTzS

# Multiplication can be reverted by the inverse hosh. Zero is the identity hosh.
print(f"{b} * {~b} = {b * ~b} = 0")

7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 * Q6OjmYZSJ8pB3ogBVMKBOxVp-oZ80czvtUrSyTzS = 0000000000000000000000000000000000000000 = 0

print(f"{b} * {ø} = {b * ø} = b")

7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 * 0000000000000000000000000000000000000000 = 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = b

print(f"{c} * {~b} = {c * ~b} = {a} = a")

z3EgxfisgqbNXBd0eqDuFiaTblBLA5ZAUbvEZgOh * Q6OjmYZSJ8pB3ogBVMKBOxVp-oZ80czvtUrSyTzS = 8CG9so9N1nQ59uNO8HGYcZ4ExQW5Haw4mErvw8m8 = 8CG9so9N1nQ59uNO8HGYcZ4ExQW5Haw4mErvw8m8 = a

print(f"{~a} * {c} = {~a * c} = {b} = b")

RNvSdLI-5RiBBGL8NekctiQofWUIeYvXFP3wvTFT * z3EgxfisgqbNXBd0eqDuFiaTblBLA5ZAUbvEZgOh = 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = b

# Division is shorthand for reversion.
print(f"{c} / {b} = {c / b} = a")

z3EgxfisgqbNXBd0eqDuFiaTblBLA5ZAUbvEZgOh / 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = 8CG9so9N1nQ59uNO8HGYcZ4ExQW5Haw4mErvw8m8 = a

# Hosh multiplication is not expected to be commutative.
print(f"{a * b} != {b * a}")

z3EgxfisgqbNXBd0eqDuFiaTblBLA5ZAUbvEZgOh != wwSd0LaGvuV0W-yEOfgB-yVBMlNLA5ZAUbvEZgOh

# Hosh multiplication is associative.
print(f"{a * (b * c)} = {(a * b) * c}")

RuTcC4ZIr0Y1QLzYmytPRc087a8cbbW9Nj-gXxAz = RuTcC4ZIr0Y1QLzYmytPRc087a8cbbW9Nj-gXxAz

# An element can be multiplied by itself a certain number of times by the `^` operator.
print(f"a³ = {a ^ 3} = {a * a * a}")

a³ = pFBLMSKyBffSXQSdmgPgW4cgWFbH4wwd2Niuxp2p = pFBLMSKyBffSXQSdmgPgW4cgWFbH4wwd2Niuxp2p
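
The algebraic properties shown above (inverse, identity, associativity, lack of commutativity) can be imitated with a small standalone model. The sketch below does not use hosh and is not claimed to be its internal algorithm. It assumes a toy group of 3x3 upper unitriangular matrices modulo a prime, with entries derived from a SHA-256 digest. The names `element`, `mul`, `inv`, `IDENTITY` and the modulus `P` are hypothetical and exist only for this illustration.

```python
import hashlib

P = 2**61 - 1  # a Mersenne prime; an arbitrary choice for this toy model (assumption)
IDENTITY = (0, 0, 0)


def element(content: bytes) -> tuple:
    """Map bytes to a toy group element (a, b, c), which stands for the
    matrix [[1, a, c], [0, 1, b], [0, 0, 1]] with entries modulo P."""
    digest = hashlib.sha256(content).digest()
    a, b, c = (int.from_bytes(digest[i:i + 8], "big") % P for i in (0, 8, 16))
    return a, b, c


def mul(x, y):
    """Matrix product, written out for the three free entries."""
    return (x[0] + y[0]) % P, (x[1] + y[1]) % P, (x[2] + y[2] + x[0] * y[1]) % P


def inv(x):
    """Inverse element, so that mul(x, inv(x)) == IDENTITY."""
    a, b, c = x
    return -a % P, -b % P, (a * b - c) % P


a = element(b"Some large binary content...")
b = element(b"Some other binary content.")
c = mul(a, b)

assert mul(b, inv(b)) == IDENTITY               # b * ~b = 0
assert mul(c, inv(b)) == a                      # c / b = a
assert mul(inv(a), c) == b                      # ~a * c = b
assert mul(a, mul(b, c)) == mul(mul(a, b), c)   # associativity
assert mul(a, b) != mul(b, a)                   # order matters (with overwhelming probability)
```

In this model, a whole sequence of steps collapses into one triple, and the last step can be undone by multiplying by its inverse on the right, which mirrors the `c * ~b` example above.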

Addition

The order of elements has no meaning in some scenarios, e.g., items inside a bag, or steps that produce the same result regardless of which one is performed first. This requires a commutative operation, which is provided by the + operator.

Examples

from hosh.theme import HTML
from hosh import Hosh, ø, setup  # ø=identity

# Enable HTML colors. See 'multiplication' example for details.
setup(dark_theme=False, format=HTML)
Hosh.__str__ = Hosh.__repr__

# Hoshes can be added.
a = Hosh(content=b"Some large binary content...")
b = Hosh(content=b"Some other binary content. Might be, e.g., an action or another large content.")
c = a + b
print(f"{a} + {b} = {c}")

8CG9so9N1nQ59uNO8HGYcZ4ExQW5Haw4mErvw8m8 + 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = 4i8YqqVswkwbMv2NKCB.p2JpsuLLA5ZAUbvEZgOh

# Addition can be reverted by subtraction. Zero is the identity hosh.
# Warning: unary operators `+b` and `-b` are related to neither addition nor inversion.
# They are reserved for the 'lift' operation - see advanced example later in documentation.
print(f"{b} - {b} = {b - b} = 0")

7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 - 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = 0000000000000000000000000000000000000000 = 0

print(f"{b} + {ø} = {b + ø} = b")

7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 + 0000000000000000000000000000000000000000 = 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = b

print(f"{c} - {a} = {c - a} = {b} = b")

4i8YqqVswkwbMv2NKCB.p2JpsuLLA5ZAUbvEZgOh - 8CG9so9N1nQ59uNO8HGYcZ4ExQW5Haw4mErvw8m8 = 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = 7N-L-10JS-H5DN0-BXW2e5ENWFQFVWswyz39t8s9 = b

# Hosh addition is always commutative.
print(f"{a + b} == {b + a}")

4i8YqqVswkwbMv2NKCB.p2JpsuLLA5ZAUbvEZgOh == 4i8YqqVswkwbMv2NKCB.p2JpsuLLA5ZAUbvEZgOh

# Hosh addition is associative.
print(f"{a + (b + c)} = {(a + b) + c}")

lywZuVigAtLiwnN02fb.9C7PUYecbbW9Nj-gXxAz = lywZuVigAtLiwnN02fb.9C7PUYecbbW9Nj-gXxAz
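
The bag scenario mentioned at the start of this section can be sketched without hosh. The example below is a toy model, not the library's algorithm: it assumes that digests are treated as integers and combined by addition modulo a fixed value. The names `digest_int`, `add`, `sub`, `bag_id` and the constant `MODULUS` are hypothetical.

```python
import hashlib

MODULUS = 2**160  # arbitrary width chosen for this toy model (assumption)


def digest_int(content: bytes) -> int:
    """Interpret a SHA-256 digest as an integer modulo MODULUS."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % MODULUS


def add(x: int, y: int) -> int:
    """Commutative, reversible combination of two toy identifiers."""
    return (x + y) % MODULUS


def sub(x: int, y: int) -> int:
    """Revert an addition."""
    return (x - y) % MODULUS


def bag_id(items) -> int:
    """Identifier of a bag: the sum of the identifiers of its items."""
    total = 0  # zero plays the role of the identity
    for item in items:
        total = add(total, digest_int(item))
    return total


fruits = [b"apple", b"banana", b"cherry"]
shuffled = [b"cherry", b"apple", b"banana"]

assert bag_id(fruits) == bag_id(shuffled)  # order does not matter
assert sub(bag_id(fruits), digest_int(b"cherry")) == bag_id([b"apple", b"banana"])
```

Removing an item from the bag is a subtraction, so the identifier of the smaller bag is obtained without rehashing the remaining items.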

Advanced Usage

This package works well out of the box for most uses. However, some scenarios can take advantage of the math features embedded in the library.

Element types

Some elements are commutative under multiplication. They are rare enough to be probabilistically nonexistent in normal use of hoshes. However, the subgroup they form is still large enough to provide interesting features.

Examples

from hosh.theme import HTML
from hosh import Hosh, ø, setup  # ø=identity

print("ongoing documentation work!")

ongoing documentation work!
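
Until the official example is available, the idea of rare commutative elements can be illustrated with a toy group that is small enough to enumerate. This is a sketch under stated assumptions, not a description of hosh internals: it assumes that a "commutative element" is one that commutes with every other element, and it uses 3x3 upper unitriangular matrices modulo a tiny prime. The names `P`, `mul`, `group` and `center` are hypothetical.

```python
from itertools import product

P = 5  # tiny prime, so the whole group (P**3 = 125 elements) can be enumerated


def mul(x, y):
    """Product of two elements (a, b, c), each standing for the matrix
    [[1, a, c], [0, 1, b], [0, 0, 1]] with entries modulo P."""
    return (x[0] + y[0]) % P, (x[1] + y[1]) % P, (x[2] + y[2] + x[0] * y[1]) % P


group = list(product(range(P), repeat=3))

# Elements that commute with every element of the group.
center = [z for z in group if all(mul(z, g) == mul(g, z) for g in group)]

assert len(group) == P**3
assert center == [(0, 0, c) for c in range(P)]  # only P out of P**3 elements
```

In this toy group the commuting elements make up a fraction of 1/P² of the whole, so they become vanishingly rare as the prime grows, while there are still P of them.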

Lifting

hosh provides a lifting mechanism inspired by a category theory concept used in Haskell. Lifting is achieved by applying a unary operator that maps a commutative element to a non-commutative subgroup. Operator + maps to a higher lexicographic rank, while operator - maps to a lower lexicographic rank. This makes it possible to temporarily impose order when combining two commutative elements.

Examples

from hosh.theme import HTML
from hosh import Hosh, ø, setup  # ø=identity

print("ongoing documentation work!")

ongoing documentation work!

Real-world Usage

A full-fledged application using hoshes is hdict, presented below. More information can be found at: Presentation, packages (hosh, hdict) or code repositories (hosh, hdict).

Author Statement (CRediT)

Davi Pereira-Santos: Conceptualization, Investigation, Software, Validation, Writing, Review & Editing. Gabriel Dalforno Silvestre: Formal analysis, Software (groups prospection), Writing (Sections 4.2 and 5). André C. P. L. F. Carvalho: Supervision, Funding acquisition, Resources, Review & Editing.

Acknowledgment

The seminal work that inspired the evolving ideas in this library was supported by CNPq and FAPESP [grant numbers 2013/07375-0, 2019/01735-0 (CEPID CeMEAI)]. We are also grateful for the initial advice from Mark Gritter and Jyrki Lahtonen on some topics of group theory.




hdict { a unique data structure }

License: GPL v3

Overview

In short: A data structure based on a novel identification paradigm, useful for frictionless computing, experiments, and distributed data, among others.
Formally: Hosh-based cacheable lazy dict with predictable/deterministic universally (probabilistically guaranteed) unique identifiers.

The Concept

hdict is like an ordinary dict with str keys. Each entry, called a field, and the hdict itself are identified by 40-digit hashes (see hosh). An hdict object (say d) provides two extra entries: _id (hdict identifier) and _ids (field identifiers), accessible through square brackets or through the shortcuts d.id and d.ids.

Examples

from hdict import hdict
from hosh.theme import HTML
from hosh import setup, Hosh

# For better integration within the documentation, we change the color theme.
setup(dark_theme=False, format=HTML)

# From named arguments.
d = hdict(x=5, y=7, z=10)

# From a dict object.
d = hdict({"x": 5, "y": 7, "z": 10})

# From an empty 'hdict' object.
d = hdict() >> {"x": 5} >> {"y": 7, "z": 10}

# All three options have the same result.
d.show()

{
    x: 5,
    y: 7,
    z: 10,
    _id: BN-3Q3Z.2Q.9nsbIYnOI75HT7xhgjvF6wErwBPTn,
    _ids: {
        x: ecvgo-CBPi7wRWIxNzuo1HgHQCbdvR058xi6zmr2,
        y: eJCW9jGsdZTD6-AD9opKwjPIOWZ4R.T0CG2kdyzf,
        z: u-Yykj2nDtKaUMGzfqScX5Y14qC7eqJrO7lXrJ1m
    }
}
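
The relationship between fields, `_ids` and `_id` shown above can be mimicked with plain Python. The sketch below does not use hdict and makes two loud assumptions: each field identifier is a SHA-256 digest of the key and the `repr` of the value, and the overall identifier is the sum of the field identifiers modulo a fixed value. The rules that hdict actually applies may differ. The names `field_id`, `identify` and `MODULUS` are hypothetical.

```python
import hashlib

MODULUS = 2**160  # arbitrary width chosen for this toy model (assumption)


def field_id(key: str, value) -> int:
    """Toy identifier of a single field, derived from its key and value."""
    data = f"{key}={value!r}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % MODULUS


def identify(fields: dict) -> dict:
    """Return a copy of `fields` with the extra entries `_id` and `_ids`."""
    ids = {key: field_id(key, value) for key, value in fields.items()}
    result = dict(fields)
    result["_id"] = sum(ids.values()) % MODULUS  # assumed combination rule
    result["_ids"] = ids
    return result


d1 = identify({"x": 5, "y": 7, "z": 10})
d2 = identify({"z": 10, "x": 5, "y": 7})
d3 = identify({"x": 5, "y": 7, "z": 11})

assert d1["_id"] == d2["_id"]                  # same content, same identifier
assert d1["_id"] != d3["_id"]                  # a changed value changes the identifier
assert d1["_ids"]["x"] == d3["_ids"]["x"]      # untouched fields keep their identifiers
```

Because the identifiers are derived only from content, two processes that build the same fields independently arrive at the same `_id` without coordinating.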

A field contains a value or a function application. A field pointing to an application is only evaluated on demand, i.e., lazily.
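
On-demand evaluation can be sketched in a few lines of plain Python. This is only an illustration of the idea, not hdict's API: the class `LazyFields`, its `evaluations` counter, and the convention that a callable field receives the container as its single argument are all assumptions made for the example.

```python
class LazyFields:
    """Toy mapping whose callable entries are evaluated on first access."""

    def __init__(self, **fields):
        self._fields = fields
        self._cache = {}
        self.evaluations = 0  # counts how many functions were actually run

    def __getitem__(self, key):
        if key not in self._cache:
            value = self._fields[key]
            if callable(value):
                # The field points to a function application: run it now.
                self.evaluations += 1
                value = value(self)
            self._cache[key] = value
        return self._cache[key]


d = LazyFields(x=5, y=7, z=lambda fields: fields["x"] * fields["y"])

assert d.evaluations == 0   # nothing is computed at construction time
assert d["z"] == 35         # first access triggers the evaluation
assert d["z"] == 35         # second access is served from the cache
assert d.evaluations == 1
```

The function behind `z` runs once, when the field is first read, and the cached value is reused afterwards.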

This documentation is still a work in progress and has been growing steadily. Last update: Feb/2023

More advanced topics will be added in the near future.

...

Short syntax

...

Copyright © 2023 - Davi Pereira dos Santos - All rights reserved