CHAPTER 04A · 10 MIN READ
Twin Matrix Algorithm
How Human Traits Become a Computable, Privacy-Preserving 256D Vector
TwinMatrixSBT is a Web3 identity protocol that captures a user's behavioral and preference profile as a 256-dimensional vector, stored immutably on-chain as a Soulbound Token. The system enables users to own their behavioral data, brands to query the pool via natural language, and personal agents to execute missions and earn rewards.
The 256-Dimensional Identity Vector
At the heart of TwinMatrixSBT is a 256-dimensional identity vector where each dimension is a uint8 value in the range [0, 255]. The 256 dimensions are partitioned into four quadrants of 64 dimensions each:
Four quadrants · 64 dimensions each · uint8[256] total
Physical Me — Body & Athletic Attributes
Digital Me — Behavior & Brand Affinity
Social Me — Relationships & Community
Spiritual Me — Values & Life Philosophy
Encoding Algorithms
The Twin Encoder converts raw questionnaire responses into the 256-dimensional vector through four distinct encoding strategies, each optimized for different data types.
One-Hot Encoding
Categorical fields (age, gender, education, income) are encoded as binary dimensions. For a category with N options, N dimensions are allocated. The selected option's dimension is set to 255; all others are 0. This preserves categorical distinctness without imposing ordinal relationships.
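A minimal sketch of this encoding in Python; the age brackets shown are illustrative, not the protocol's actual option list:

```python
def one_hot(options: list[str], selected: str) -> list[int]:
    """Encode a categorical field: the selected option's dimension is 255, all others 0."""
    return [255 if opt == selected else 0 for opt in options]

# Hypothetical age-bracket field with four options
dims = one_hot(["18-24", "25-34", "35-44", "45+"], "25-34")
# dims == [0, 255, 0, 0]
```

Because exactly one dimension is hot, the encoding never implies that "35-44" is somehow "more" than "18-24", which an ordinal scale would.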
Rank-Weighted Encoding
Ordered preferences (sports, brands) use rank-weighted values. For each selected item at rank position r (0-based): weight = 255 × (1 − r / totalSelected). The top-ranked item receives 255, subsequent items receive proportionally lower values. This captures both preference inclusion and relative importance.
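The rank-weighting formula translates directly into a short sketch:

```python
def rank_weighted(ranked: list[str]) -> dict[str, int]:
    """weight = 255 * (1 - r / totalSelected) for 0-based rank r."""
    n = len(ranked)
    return {item: round(255 * (1 - r / n)) for r, item in enumerate(ranked)}

weights = rank_weighted(["running", "cycling", "swimming"])
# running -> 255, cycling -> 170, swimming -> 85
```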
L1-Normalized Encoding
Multi-select brand choices use L1 normalization. Given K selected brands: raw[i] = 255 / K for equal weighting. The sum across brand dimensions ≈ 255 (L1 norm = 255). This ensures total brand affinity is comparable across users regardless of how many brands they select.
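A sketch of the equal-weighting step; note that uint8 rounding can leave the L1 norm slightly off 255 (e.g. four brands round to 64 each, summing to 256), which is why the text says ≈ 255:

```python
def l1_brands(selected: list[str]) -> dict[str, int]:
    """Equal weighting across K selected brands: each gets 255 / K (L1 norm ≈ 255)."""
    k = len(selected)
    return {brand: round(255 / k) for brand in selected}

dims = l1_brands(["Nike", "Adidas", "Asics", "On"])
# each brand -> 64; sum = 256 ≈ 255 after uint8 rounding
```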
Complementary Bar Encoding
Bipolar slider inputs (Solo↔Group, Passive↔Active) use complementary pairs. For a slider at position value (0–100): leftDim = round(value × 255 / 100), rightDim = 255 − leftDim. A Solo↔Group slider at 70 yields: dim[solo] = 179, dim[group] = 76. Both dimensions always sum to 255.
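The pair computation can be sketched as follows; round-half-up is an assumption chosen so the worked example (70 yields 179/76) holds, since Python's built-in round uses banker's rounding:

```python
def complementary_pair(value: int) -> tuple[int, int]:
    """Map a 0-100 slider position to a (left, right) pair that always sums to 255."""
    left = int(value * 255 / 100 + 0.5)  # round half up: 70 -> 179
    right = 255 - left
    return left, right

solo, group = complementary_pair(70)
# solo == 179, group == 76
```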
The system computes a density score to measure how complete a user's identity vector is: density = (count of dims with value > threshold) / 256, where threshold ≈ 0 (near-zero cutoff). Density is reported per quadrant, helping brands assess the reliability of match results. A user who completes all physical and spiritual sections but skips social data will show high Q1/Q4 density but low Q3 density.
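Density and its per-quadrant breakdown might be computed as follows, with the threshold defaulting to the near-zero cutoff described above:

```python
def density(vector: list[int], threshold: int = 0) -> float:
    """Fraction of dimensions whose value exceeds the near-zero threshold."""
    return sum(1 for v in vector if v > threshold) / len(vector)

def quadrant_density(vector: list[int]) -> list[float]:
    """Per-quadrant density over the four 64-dim slices (Q1..Q4)."""
    return [density(vector[q * 64:(q + 1) * 64]) for q in range(4)]

vec = [0] * 192 + [200] * 64  # hypothetical user: only the Spiritual quadrant filled
# density(vec) == 0.25, quadrant_density(vec) == [0.0, 0.0, 0.0, 1.0]
```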
Matrix Matching Engine
The Matrix is the privacy-preserving intelligence layer. It receives natural language queries from brands, parses them into structured dimension conditions, and matches against the user pool — without ever exposing raw vector values.
1. Natural Language Input
A brand submits a query like: "Find active runners aged 25–34 who follow Nike and have high environmental concern." The query is received via POST /v1/matrix/inject.
2. LLM Parser (Cascading)
The query is processed through a cascading LLM chain: Claude API (primary) → OpenAI API (fallback) → Local keyword rules (zero-dependency fallback). The LLM converts the natural language into structured conditions: [{ dimension_idx, operator, threshold }].
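The cascade can be sketched as a simple fallthrough chain. The two API parsers are stubs here (the real calls are not shown in this chapter), and the keyword rule's dimension index is an assumption for illustration:

```python
def parse_with_claude(query: str) -> list[dict]:
    raise RuntimeError("stub: Claude API call not wired up in this sketch")

def parse_with_openai(query: str) -> list[dict]:
    raise RuntimeError("stub: OpenAI API call not wired up in this sketch")

def keyword_rules(query: str) -> list[dict]:
    """Zero-dependency last resort; the dimension index is illustrative only."""
    rules = {"runner": {"dimension_idx": 12, "operator": ">=", "threshold": 150}}
    return [cond for kw, cond in rules.items() if kw in query.lower()]

def parse_query(query: str) -> list[dict]:
    """Cascade: Claude -> OpenAI -> local keyword rules; fall through on failure."""
    for parser in (parse_with_claude, parse_with_openai, keyword_rules):
        try:
            conditions = parser(query)
        except Exception:
            continue
        if conditions:
            return conditions
    return []
```

The key design point is that a provider outage degrades parsing quality but never takes the endpoint down.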
3. Security Validator
Every parsed condition passes through strict validation: whitelist-only operators (>=, <=, ==, >, <), dimension index bounds [0, 255], threshold clamping [0, 255]. Prompt injection patterns are detected and blocked via regex. Input is Unicode NFKC-normalized with zero-width characters stripped.
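A minimal sketch of the sanitization and validation steps, using only the rules stated above (the 500-char cap from Layer 1 is folded in):

```python
import re
import unicodedata

ALLOWED_OPS = {">=", "<=", "==", ">", "<"}
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\ufeff]")

def sanitize_query(query: str) -> str:
    """NFKC-normalize, strip zero-width characters, cap length at 500 chars."""
    query = unicodedata.normalize("NFKC", query)
    return ZERO_WIDTH.sub("", query)[:500]

def validate_condition(cond: dict) -> dict:
    """Whitelist the operator, bound the index, clamp the threshold to [0, 255]."""
    if cond["operator"] not in ALLOWED_OPS:
        raise ValueError(f"operator {cond['operator']!r} not whitelisted")
    if not 0 <= cond["dimension_idx"] <= 255:
        raise ValueError("dimension index out of bounds")
    return {**cond, "threshold": max(0, min(255, cond["threshold"]))}
```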
4. Matching Engine (NumPy)
The engine scans the agent pool and evaluates each condition against the user's vector dimensions. Match score = satisfied_conditions / total_conditions. Users are ranked by score and returned as anonymized results: { matched_count, match_rate, sample_profiles }.
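Assuming the pool is held as an (n_users, 256) uint8 NumPy matrix, the per-user match rate might be computed as:

```python
import numpy as np

OPS = {">=": np.greater_equal, "<=": np.less_equal, "==": np.equal,
       ">": np.greater, "<": np.less}

def match_pool(pool: np.ndarray, conditions: list[dict]) -> np.ndarray:
    """pool: (n_users, 256) uint8 matrix. Returns satisfied / total per user."""
    satisfied = np.zeros(pool.shape[0])
    for c in conditions:
        # Vectorized comparison over one dimension column for all users at once
        satisfied += OPS[c["operator"]](pool[:, c["dimension_idx"]], c["threshold"])
    return satisfied / len(conditions)
```

Each condition is evaluated as one vectorized column comparison, so cost scales with conditions × users rather than requiring a per-user loop.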
5. Privacy Aggregation
Raw dimension values are NEVER returned. The system converts values to qualitative labels only: ≥200 → "Very High", ≥150 → "High", <150 → "Moderate". Brands receive only aggregate statistics and these qualitative labels, preserving complete user privacy.
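The label thresholds translate directly to code:

```python
def to_label(value: int) -> str:
    """Map a raw dimension value to a qualitative label; the raw value never leaves."""
    if value >= 200:
        return "Very High"
    if value >= 150:
        return "High"
    return "Moderate"
```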
Cosine Alignment Algorithm
The alignment endpoint computes how well a user's profile matches a brand's ideal customer vector. The algorithm uses a weighted dual-contribution model.
For each authorized scope (e.g., mobility, style):
Load user projection: { soul: {key: value}, skill: {key: value} }
soulContrib = mean(projection.soul.values)
skillContrib = Σ(user.skill[k] × brand.matrix[k]) / overlap_count
alignmentScore = 0.4 × soulContrib + 0.6 × skillContrib
The 60/40 weighting favors skill-based (behavioral) alignment over soul-based (attitudinal) alignment — reflecting the principle that actions are stronger signals than preferences.
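Putting the steps together, the scoring might look like this sketch; the field names follow the projection shape shown above, and the sample keys are illustrative:

```python
def alignment_score(projection: dict, brand_matrix: dict) -> float:
    """0.4 * soulContrib + 0.6 * skillContrib, per the weighted dual-contribution model."""
    soul = projection["soul"]
    soul_contrib = sum(soul.values()) / len(soul)
    # skillContrib averages products over the keys both sides share
    overlap = [k for k in projection["skill"] if k in brand_matrix]
    skill_contrib = (sum(projection["skill"][k] * brand_matrix[k] for k in overlap)
                     / len(overlap)) if overlap else 0.0
    return 0.4 * soul_contrib + 0.6 * skill_contrib

user = {"soul": {"calm": 0.5, "active": 0.5},
        "skill": {"run_freq": 1.0, "swim_freq": 0.8}}
brand = {"run_freq": 1.0}  # hypothetical ideal-customer vector
# alignment_score(user, brand) == 0.4 * 0.5 + 0.6 * 1.0 == 0.8
```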
On-Chain Permission Gating
Data sovereignty is enforced through on-chain permission bitmasks. When a user grants an agent access, they specify a scopeMask (uint256) that defines which lifestyle domains the agent can read. The SBT contract's getAuthorizedLatestValues function validates that msg.sender is an agent authorized via bindAndGrant before returning any data.
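A scope check against the bitmask is a single shift-and-mask; the domain-to-bit assignments here are assumptions for illustration:

```python
STYLE, FOOD, SOCIAL, MOBILITY = 0, 1, 2, 3  # hypothetical bit assignments

def has_scope(scope_mask: int, bit: int) -> bool:
    """Test one lifestyle-domain bit in the uint256 scopeMask."""
    return bool(scope_mask >> bit & 1)

mask = 0b1001  # bits 0 and 3 set
# has_scope(mask, STYLE) and has_scope(mask, MOBILITY) -> True; all others -> False
```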
A brand querying fitness data receives scopeMask = 0b1001, granting access only to style (bit 0) and mobility (bit 3). All other quadrant data remains invisible — even though the user's SBT contains the full 256D vector on-chain.
Defense-in-Depth Security
The system implements multiple security layers from input processing to on-chain enforcement, ensuring data integrity and privacy at every stage.
Layer 1: Input Sanitization
Unicode NFKC normalization, zero-width character stripping, query length capping (500 chars). Prompt injection detection via regex patterns blocks known LLM exploits (system prompt extraction, instruction override).
Layer 2: Condition Validation
Whitelist-only operators (>=, <=, ==, >, <). Dimension index bounds [0, 255]. Threshold clamping [0, 255]. HMAC timing-safe API key comparison. Rate limiting: 100 req/15 min (backend), 20 req/min (Matrix).
Layer 3: Privacy Aggregation
Raw dimension values are NEVER exposed to brands. Only qualitative labels (Very High / High / Moderate) and aggregate statistics. Private keys redacted from all API responses via sanitizeAgent(). On-chain permission enforcement via SBT contract.
From Encoding to Economic Value
The 256D Twin Matrix creates economic value through a closed-loop system where user data sovereignty and brand utility reinforce each other.
Brand Campaign → User Mission → USDT Reward
Brands create campaigns via natural language queries. The Matrix matches users and dispatches individualized missions to Personal Agents via Telegram. On completion, the system automatically transfers USDT to the agent's on-chain wallet. Access tiers (Free: 100 calls/month, Pro: 5K calls/month, Enterprise: unlimited) ensure sustainable revenue.
Projection Engine → Scoped Intelligence
Raw 256D vectors are converted into domain-specific projections — human-readable, scope-limited views of identity. The mobility projection decomposes into soul dimensions (sport preferences, active/passive) and skill dimensions (frequency, duration, brand affinities). Each raw uint8 value is normalized to a float in [0, 1] for the projection output.
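The per-dimension normalization is a single division over the dimensions a scope exposes; the index-to-name mapping below is an assumed layout, not the protocol's actual one:

```python
def project(vector: list[int], dim_names: dict[int, str]) -> dict[str, float]:
    """Build a scoped, human-readable view: raw uint8 -> float in [0, 1]."""
    return {name: vector[idx] / 255 for idx, name in dim_names.items()}

vec = [0] * 256
vec[10], vec[11] = 255, 51  # hypothetical mobility dimensions
view = project(vec, {10: "run_freq", 11: "cycle_freq"})
# view == {"run_freq": 1.0, "cycle_freq": 0.2}
```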
Density-Driven Matching Quality
Higher vector density = more precise matching = higher reward potential. Users are incentivized to fill in more dimensions because richer data means more campaign matches and higher alignment scores. The system self-improves through this economic gravity.
Brands never see your data. They see what your data means — for them.