CHAPTER 04A · 10 MIN READ
Twin Matrix Algorithm
How Human Traits Become a Computable, Privacy-Preserving 256D Vector
TwinMatrixSBT is a Web3 identity protocol that captures a user's behavioral and preference profile as a 256-dimensional vector, stored immutably on-chain as a Soulbound Token. The system enables users to own their behavioral data, brands to query the pool via natural language, and personal agents to execute missions and earn rewards.
The 256-Dimensional Identity Vector
At the heart of TwinMatrixSBT is a 256-dimensional identity vector where each dimension is a uint8 value in the range [0, 255]. The 256 dimensions are partitioned into four quadrants of 64 dimensions each:
Four quadrants · 64 dimensions each · uint8[256] total
Physical Me — Body & Athletic Attributes
Digital Me — Behavior & Brand Affinity
Social Me — Relationships & Community
Spiritual Me — Values & Life Philosophy
Encoding Algorithms
The Twin Encoder converts raw questionnaire responses into the 256-dimensional vector through four distinct encoding strategies, each optimized for different data types.
One-Hot Encoding
Categorical fields (age, gender, education, income) are encoded as binary dimensions. For a category with N options, N dimensions are allocated. The selected option's dimension is set to 255; all others are 0. This preserves categorical distinctness without imposing ordinal relationships.
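A minimal sketch of this encoding in Python; the age brackets shown are illustrative, not the protocol's actual option list:

```python
def one_hot(options: list[str], selected: str) -> list[int]:
    """Encode a categorical field: the selected option's dimension is 255, all others 0."""
    return [255 if opt == selected else 0 for opt in options]

# Hypothetical age-bracket field with four options
dims = one_hot(["18-24", "25-34", "35-44", "45+"], "25-34")
# dims == [0, 255, 0, 0]
```

Because exactly one dimension is hot, the encoding never implies that "35-44" is somehow "more" than "18-24", which an ordinal scale would.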
Rank-Weighted Encoding
Ordered preferences (sports, brands) use rank-weighted values. For each selected item at rank position r (0-based): weight = 255 × (1 − r / totalSelected). The top-ranked item receives 255, subsequent items receive proportionally lower values. This captures both preference inclusion and relative importance.
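The rank-weighting formula translates directly into a short sketch:

```python
def rank_weighted(ranked: list[str]) -> dict[str, int]:
    """weight = 255 * (1 - r / totalSelected) for 0-based rank r."""
    n = len(ranked)
    return {item: round(255 * (1 - r / n)) for r, item in enumerate(ranked)}

weights = rank_weighted(["running", "cycling", "swimming"])
# running -> 255, cycling -> 170, swimming -> 85
```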
L1-Normalized Encoding
Multi-select brand choices use L1 normalization. Given K selected brands: raw[i] = 255 / K for equal weighting. The sum across brand dimensions ≈ 255 (L1 norm = 255). This ensures total brand affinity is comparable across users regardless of how many brands they select.
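A sketch of the equal-weighting step; note that uint8 rounding can leave the L1 norm slightly off 255 (e.g. four brands round to 64 each, summing to 256), which is why the text says ≈ 255:

```python
def l1_brands(selected: list[str]) -> dict[str, int]:
    """Equal weighting across K selected brands: each gets 255 / K (L1 norm ≈ 255)."""
    k = len(selected)
    return {brand: round(255 / k) for brand in selected}

dims = l1_brands(["Nike", "Adidas", "Asics", "On"])
# each brand -> 64; sum = 256 ≈ 255 after uint8 rounding
```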
Complementary Bar Encoding
Bipolar slider inputs (Solo↔Group, Passive↔Active) use complementary pairs. For a slider at position value (0–100): leftDim = round(value × 255 / 100), rightDim = 255 − leftDim. A Solo↔Group slider at 70 yields: dim[solo] = 179, dim[group] = 76. Both dimensions always sum to 255.
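The pair computation can be sketched as follows; round-half-up is an assumption chosen so the worked example (70 yields 179/76) holds, since Python's built-in round uses banker's rounding:

```python
def complementary_pair(value: int) -> tuple[int, int]:
    """Map a 0-100 slider position to a (left, right) pair that always sums to 255."""
    left = int(value * 255 / 100 + 0.5)  # round half up: 70 -> 179
    right = 255 - left
    return left, right

solo, group = complementary_pair(70)
# solo == 179, group == 76
```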
The system computes a density score to measure how complete a user's identity vector is: density = (count of dims with value > threshold) / 256, where threshold ≈ 0 (near-zero cutoff). Density is reported per quadrant, helping brands assess the reliability of match results. A user who completes all physical and spiritual sections but skips social data will show high Q1/Q4 density but low Q3 density.
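Density and its per-quadrant breakdown might be computed as follows, with the threshold defaulting to the near-zero cutoff described above:

```python
def density(vector: list[int], threshold: int = 0) -> float:
    """Fraction of dimensions whose value exceeds the near-zero threshold."""
    return sum(1 for v in vector if v > threshold) / len(vector)

def quadrant_density(vector: list[int]) -> list[float]:
    """Per-quadrant density over the four 64-dim slices (Q1..Q4)."""
    return [density(vector[q * 64:(q + 1) * 64]) for q in range(4)]

vec = [0] * 192 + [200] * 64  # hypothetical user: only the Spiritual quadrant filled
# density(vec) == 0.25, quadrant_density(vec) == [0.0, 0.0, 0.0, 1.0]
```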
Matrix Matching Engine
The Matrix is the privacy-preserving intelligence layer. It receives natural language queries from brands, parses them into structured dimension conditions, and matches against the user pool — without ever exposing raw vector values.
1. Natural Language Input
A brand submits a query like: "Find active runners aged 25–34 who follow Nike and have high environmental concern." The query is received via POST /v1/matrix/inject.
2. LLM Parser (Cascading)
The query is processed through a cascading LLM chain: Claude API (primary) → OpenAI API (fallback) → Local keyword rules (zero-dependency fallback). The LLM converts the natural language into structured conditions: [{ dimension_idx, operator, threshold }].
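The cascade can be sketched as a simple fallthrough chain. The two API parsers are stubs here (the real calls are not shown in this chapter), and the keyword rule's dimension index is an assumption for illustration:

```python
def parse_with_claude(query: str) -> list[dict]:
    raise RuntimeError("stub: Claude API call not wired up in this sketch")

def parse_with_openai(query: str) -> list[dict]:
    raise RuntimeError("stub: OpenAI API call not wired up in this sketch")

def keyword_rules(query: str) -> list[dict]:
    """Zero-dependency last resort; the dimension index is illustrative only."""
    rules = {"runner": {"dimension_idx": 12, "operator": ">=", "threshold": 150}}
    return [cond for kw, cond in rules.items() if kw in query.lower()]

def parse_query(query: str) -> list[dict]:
    """Cascade: Claude -> OpenAI -> local keyword rules; fall through on failure."""
    for parser in (parse_with_claude, parse_with_openai, keyword_rules):
        try:
            conditions = parser(query)
        except Exception:
            continue
        if conditions:
            return conditions
    return []
```

The key design point is that a provider outage degrades parsing quality but never takes the endpoint down.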
3. Security Validator
Every parsed condition passes through strict validation: whitelist-only operators (>=, <=, ==, >, <), dimension index bounds [0, 255], threshold clamping [0, 255]. Prompt injection patterns are detected and blocked via regex. Input is Unicode NFKC-normalized with zero-width characters stripped.
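A minimal sketch of the sanitization and validation steps, using only the rules stated above (the 500-char cap from Layer 1 is folded in):

```python
import re
import unicodedata

ALLOWED_OPS = {">=", "<=", "==", ">", "<"}
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\ufeff]")

def sanitize_query(query: str) -> str:
    """NFKC-normalize, strip zero-width characters, cap length at 500 chars."""
    query = unicodedata.normalize("NFKC", query)
    return ZERO_WIDTH.sub("", query)[:500]

def validate_condition(cond: dict) -> dict:
    """Whitelist the operator, bound the index, clamp the threshold to [0, 255]."""
    if cond["operator"] not in ALLOWED_OPS:
        raise ValueError(f"operator {cond['operator']!r} not whitelisted")
    if not 0 <= cond["dimension_idx"] <= 255:
        raise ValueError("dimension index out of bounds")
    return {**cond, "threshold": max(0, min(255, cond["threshold"]))}
```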
4. Matching Engine (NumPy)
The engine scans the agent pool and evaluates each condition against the user's vector dimensions. Match score = satisfied_conditions / total_conditions. Users are ranked by score and returned as anonymized results: { matched_count, match_rate, sample_profiles }.
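Assuming the pool is held as an (n_users, 256) uint8 NumPy matrix, the per-user match rate might be computed as:

```python
import numpy as np

OPS = {">=": np.greater_equal, "<=": np.less_equal, "==": np.equal,
       ">": np.greater, "<": np.less}

def match_pool(pool: np.ndarray, conditions: list[dict]) -> np.ndarray:
    """pool: (n_users, 256) uint8 matrix. Returns satisfied / total per user."""
    satisfied = np.zeros(pool.shape[0])
    for c in conditions:
        # Vectorized comparison over one dimension column for all users at once
        satisfied += OPS[c["operator"]](pool[:, c["dimension_idx"]], c["threshold"])
    return satisfied / len(conditions)
```

Each condition is evaluated as one vectorized column comparison, so cost scales with conditions × users rather than requiring a per-user loop.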
5. Privacy Aggregation
Raw dimension values are NEVER returned. The system converts values to qualitative labels only: ≥200 → "Very High", ≥150 → "High", <150 → "Moderate". Brands receive only aggregate statistics and these qualitative labels, preserving complete user privacy.
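The label thresholds translate directly to code:

```python
def to_label(value: int) -> str:
    """Map a raw dimension value to a qualitative label; the raw value never leaves."""
    if value >= 200:
        return "Very High"
    if value >= 150:
        return "High"
    return "Moderate"
```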
Cosine Alignment Algorithm
The alignment endpoint computes how well a user's profile matches a brand's ideal customer vector. The algorithm uses a weighted dual-contribution model.
For each authorized scope (e.g., mobility, style):
Load user projection: { soul: {key: value}, skill: {key: value} }
soulContrib = mean(projection.soul.values)
skillContrib = Σ(user.skill[k] × brand.matrix[k]) / overlap_count
alignmentScore = 0.4 × soulContrib + 0.6 × skillContrib
The 60/40 weighting favors skill-based (behavioral) alignment over soul-based (attitudinal) alignment — reflecting the principle that actions are stronger signals than preferences.
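Putting the steps together, the scoring might look like this sketch; the field names follow the projection shape shown above, and the sample keys are illustrative:

```python
def alignment_score(projection: dict, brand_matrix: dict) -> float:
    """0.4 * soulContrib + 0.6 * skillContrib, per the weighted dual-contribution model."""
    soul = projection["soul"]
    soul_contrib = sum(soul.values()) / len(soul)
    # skillContrib averages products over the keys both sides share
    overlap = [k for k in projection["skill"] if k in brand_matrix]
    skill_contrib = (sum(projection["skill"][k] * brand_matrix[k] for k in overlap)
                     / len(overlap)) if overlap else 0.0
    return 0.4 * soul_contrib + 0.6 * skill_contrib

user = {"soul": {"calm": 0.5, "active": 0.5},
        "skill": {"run_freq": 1.0, "swim_freq": 0.8}}
brand = {"run_freq": 1.0}  # hypothetical ideal-customer vector
# alignment_score(user, brand) == 0.4 * 0.5 + 0.6 * 1.0 == 0.8
```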
On-Chain Permission Gating
Data sovereignty is enforced through on-chain permission bitmasks. When a user grants an agent access, they specify a scopeMask (uint256) that defines which lifestyle domains the agent can read. The SBT contract's getAuthorizedLatestValues function validates that msg.sender is an agent authorized via bindAndGrant before returning any data.
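A scope check against the bitmask is a single shift-and-mask; the domain-to-bit assignments here are assumptions for illustration:

```python
STYLE, FOOD, SOCIAL, MOBILITY = 0, 1, 2, 3  # hypothetical bit assignments

def has_scope(scope_mask: int, bit: int) -> bool:
    """Test one lifestyle-domain bit in the uint256 scopeMask."""
    return bool(scope_mask >> bit & 1)

mask = 0b1001  # bits 0 and 3 set
# has_scope(mask, STYLE) and has_scope(mask, MOBILITY) -> True; all others -> False
```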
A brand querying fitness data receives scopeMask = 0b1001, granting access only to style (bit 0) and mobility (bit 3). All other quadrant data remains invisible — even though the user's SBT contains the full 256D vector on-chain.
Defense-in-Depth Security
The system implements multiple security layers from input processing to on-chain enforcement, ensuring data integrity and privacy at every stage.
Layer 1: Input Sanitization
Unicode NFKC normalization, zero-width character stripping, query length capping (500 chars). Prompt injection detection via regex patterns blocks known LLM exploits (system prompt extraction, instruction override).
Layer 2: Condition Validation
Whitelist-only operators (>=, <=, ==, >, <). Dimension index bounds [0, 255]. Threshold clamping [0, 255]. HMAC timing-safe API key comparison. Rate limiting: 100 req/15 min (backend), 20 req/min (Matrix).
Layer 3: Privacy Aggregation
Raw dimension values are NEVER exposed to brands. Only qualitative labels (Very High / High / Moderate) and aggregate statistics. Private keys redacted from all API responses via sanitizeAgent(). On-chain permission enforcement via SBT contract.
From Encoding to Economic Value
The 256D Twin Matrix creates economic value through a closed-loop system where user data sovereignty and brand utility reinforce each other.
Brand Campaign → User Mission → USDT Reward
Brands create campaigns via natural language queries. The Matrix matches users and dispatches individualized missions to Personal Agents via Telegram. On completion, the system automatically transfers USDT to the agent's on-chain wallet. Access tiers (Free: 100 calls/month, Pro: 5K calls/month, Enterprise: unlimited) ensure sustainable revenue.
Projection Engine → Scoped Intelligence
Raw 256D vectors are converted into domain-specific projections — human-readable, scope-limited views of identity. The mobility projection decomposes into soul dimensions (sport preferences, active/passive) and skill dimensions (frequency, duration, brand affinities). Each raw uint8 value is normalized to a float in [0, 1] for the projection output.
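The per-dimension normalization is a single division over the dimensions a scope exposes; the index-to-name mapping below is an assumed layout, not the protocol's actual one:

```python
def project(vector: list[int], dim_names: dict[int, str]) -> dict[str, float]:
    """Build a scoped, human-readable view: raw uint8 -> float in [0, 1]."""
    return {name: vector[idx] / 255 for idx, name in dim_names.items()}

vec = [0] * 256
vec[10], vec[11] = 255, 51  # hypothetical mobility dimensions
view = project(vec, {10: "run_freq", 11: "cycle_freq"})
# view == {"run_freq": 1.0, "cycle_freq": 0.2}
```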
Density-Driven Matching Quality
Higher vector density = more precise matching = higher reward potential. Users are incentivized to fill in more dimensions because richer data means more campaign matches and higher alignment scores. The system self-improves through this economic gravity.
Brands never see your data. They see what your data means — for them.