# SpaceCraft — Datamining Report

Extracted from the local install (Steam app **3276050**, build dated Jun 11–12 2026). All findings come from offline asset extraction only — no network capture, no live-server interaction.

## What the install is made of

The game is built on **Heaps.io + HashLink** (Shiro Games' in-house stack). The two files that matter:

- **`hlboot.dat`** (12 MB) — the entire game logic as HashLink bytecode (`HLB\x04`, version 4). Decompilable.
- **`res.pak`** (16.37 GB) — Heaps PAK archive holding every asset *and* the game database.

Everything else is engine/vendor DLLs (DX12, DLSS, FMOD, SDL2, Steam, OpenAL).
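
The bytecode identification can be re-checked from the first four bytes of the file. This is a minimal sketch that assumes only what is stated above: a three-byte `HLB` magic followed by a one-byte version. The function name and the commented file path are illustrative.

```python
def read_hl_header(blob: bytes) -> int:
    """Return the HashLink bytecode version byte, or raise if the magic is wrong."""
    if len(blob) < 4 or blob[:3] != b"HLB":
        raise ValueError("not a HashLink bytecode file")
    return blob[3]

# Usage against the real file (path is an assumption about the install layout):
# with open("hlboot.dat", "rb") as f:
#     print(read_hl_header(f.read(4)))
```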

## res.pak — fully indexed

I reverse-engineered the PAK format and parsed the whole table of contents. The layout is a header (`version`, `headerSize`, `dataSize`) followed by a name-first recursive file tree; large-file offsets are stored as 8-byte doubles because the archive exceeds 4 GB. The parse is exact: the sum of all file sizes equals the archive size to the byte.
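
The table-of-contents walk can be sketched as follows. The field widths, the flag bits (`1` = directory, `2` = offset stored as a double), and the per-file checksum field are assumptions modelled on the layout described above and on Heaps' open-source `hxd.fmt.pak` reader as best understood; they should be verified against the real archive. The function name and the returned dictionary keys are illustrative.

```python
import struct

def parse_pak_toc(blob: bytes):
    """Parse the header and file tree; return (header, [(path, offset, size)])."""
    if blob[:3] != b"PAK":
        raise ValueError("missing PAK magic")
    version = blob[3]
    header_size, data_size = struct.unpack_from("<ii", blob, 4)
    pos = 12
    files = []

    def read_entry(prefix: str) -> None:
        nonlocal pos
        # Name comes first: one length byte, then the name bytes.
        name_len = blob[pos]
        name = blob[pos + 1 : pos + 1 + name_len].decode("utf-8")
        pos += 1 + name_len
        flags = blob[pos]
        pos += 1
        path = f"{prefix}/{name}" if prefix else name
        if flags & 1:  # directory: child count, then the children
            (count,) = struct.unpack_from("<i", blob, pos)
            pos += 4
            for _ in range(count):
                read_entry(path)
        else:
            if flags & 2:  # large file: offset stored as an 8-byte double
                (offset,) = struct.unpack_from("<d", blob, pos)
                pos += 8
            else:
                (offset,) = struct.unpack_from("<i", blob, pos)
                pos += 4
            size, _checksum = struct.unpack_from("<ii", blob, pos)
            pos += 8
            files.append((path, int(offset), size))

    read_entry("")  # the root is assumed to be a directory with an empty name
    header = {"version": version, "header_size": header_size, "data_size": data_size}
    return header, files
```

A consistency check of the kind described above is then a one-liner: `sum(size for _, _, size in files)` compared against the expected total.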

**840 directories, 8,230 files.** By type:

| count | type | notes |
|------:|------|-------|
| 3,265 | .png | textures (9.7 GB) |
| 2,189 | .fbx | models (5.3 GB) |
| 1,717 | .prefab | entity/scene definitions |
| 365 | .props | per-asset material/model props |
| 257 | .fx | shaders |
| 144 | .shgraph | shaders |
| 72 | .dds | (1.2 GB) |
| **1** | **.cdb** | **the entire game database (2.7 MB)** |

Full listing with offsets/sizes: `pak_filelist.tsv`.
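
The per-type counts in the table can be re-derived from the file listing. A minimal sketch follows; it assumes the path is the first tab-separated column of `pak_filelist.tsv` and that there is no header row, neither of which is specified above. The function name is illustrative.

```python
from collections import Counter
from pathlib import PurePosixPath

def count_by_extension(paths):
    """Tally lowercase file extensions; extension-less files are keyed as ''."""
    return Counter(PurePosixPath(p).suffix.lower() for p in paths)

# Usage (column layout of the TSV is an assumption):
# with open("pak_filelist.tsv", encoding="utf-8") as f:
#     paths = [line.split("\t")[0] for line in f if line.strip()]
# for ext, n in count_by_extension(paths).most_common():
#     print(n, ext)
```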

## data.cdb — the game database (the real prize)

A single **CastleDB** file (JSON) holds **218 sheets**. This is the design data for the whole game, human-readable. Highlights:

- **585 items**, **479 crafting recipes** (full input→output→building chains), **203 resources**, **189 permits** (tech unlocks), **138 item types**, **236 attributes**, **51 skills**.
- **World structure is authored, not random:** 24 named **sectors** (Threshold, Terminus, Cairn, Landmark, Moor, Fool's Rim, Strand, Horizon, Ebb, Axon, Idol…), each with a GUID and a full `generation` block; 9 named **systems** (Solar Alpha, Janus, Hypatia, Sawma, Amundsen, Nevada, Wisehole…); 15 planet templates; 42 `planetGen` rules; 95 `resGen` resource-spawn rules.
- **11 factions:** The Company, Space Pirates, IUOB, Stellar Engineering, Ninmah Group, Eris Corp., Vox Populi, Children of Gaia, The Republic, plus Players/Monster.
- **40 ship models** (Scrappy Pioneer, Miner, various pirates…).
- **990 tuning constants** with developer comments.

Extracted: `data.cdb` (raw), `cdb_sheets.tsv` (sheet→row counts + columns), `worldgen.json` (systems/sectors/planets/factions/ships), `items.json`, `recipes.json`.
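
Because the database is plain JSON, a sheet summary like `cdb_sheets.tsv` takes only a few lines to produce. The sketch below assumes the usual CastleDB layout: a top-level `sheets` array whose entries carry `name`, `columns` (each with a `name`), and `lines`. The function name is illustrative.

```python
import json

def summarize_cdb(text: str):
    """Return [(sheet_name, row_count, [column names])] for a CastleDB JSON document."""
    db = json.loads(text)
    summary = []
    for sheet in db.get("sheets", []):
        columns = [col["name"] for col in sheet.get("columns", [])]
        summary.append((sheet["name"], len(sheet.get("lines", [])), columns))
    return summary

# Usage:
# with open("data.cdb", encoding="utf-8") as f:
#     for name, rows, cols in summarize_cdb(f.read()):
#         print(name, rows, ",".join(cols), sep="\t")
```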

### The big finding for the "live map" question

The earlier assumption was that planet/system/resource positions live only on the server and can't be obtained without packet capture. That's only half true. The world is **procedurally generated from seeds that ship in the client**:

- Each system carries an explicit seed (e.g. *Solar Alpha* → `seed: 55577`).
- Each sector carries deterministic generation rules: system counts (`minSystem`/`maxSystem`), planet attribute weight tables, resource-gen lists, POI counts, etc.
- Tuning constants are consistent with seeded, deterministic placement: `resourcesPlacementSeed`, `SystemMinPlanets`/`MaxPlanets` (2–5), `SystemMinDist` (200), `SystemMinCount`/`MaxCount` (5–15), `SystemSpreadRadiusBegin`, `SystemAngleIncr`.

So the galaxy layout should be reconstructable **offline** by reimplementing the generator (it's in `hlboot.dat`) against this data, rather than by crawling a live shard. The genuinely server-side, non-derivable part is dynamic state: **player-built bases and live entity positions**. Whether the NA shard's galaxy differs from other regions' would then come down to whether they share a master seed; the generation *rules* ship in the client and are identical across regions (region appears to be just routing; see prefs below).
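
To illustrate why a shipped seed makes the layout reproducible, here is a toy seeded placement routine. It is **not** the game's generator: the spiral walk, the parameter names `radius_begin` and `angle_incr` and their values, and the jitter amounts are all hypothetical. Only the minimum distance (200) and the count range (5–15) are taken from the constants listed above. Python's `random` is also not the PRNG the game uses, so the coordinates will not match the real galaxy.

```python
import math
import random

def place_systems(seed, min_count=5, max_count=15, min_dist=200.0,
                  radius_begin=300.0, angle_incr=0.7):
    """Place points on a jittered outward spiral, rejecting any that are too close."""
    rng = random.Random(seed)          # same seed -> same sequence of draws
    count = rng.randint(min_count, max_count)
    points = []
    radius, angle = radius_begin, 0.0
    while len(points) < count:
        angle += angle_incr + rng.random() * 0.3
        radius += rng.random() * 40.0  # drift outward so placement always terminates
        candidate = (radius * math.cos(angle), radius * math.sin(angle))
        if all(math.dist(candidate, other) >= min_dist for other in points):
            points.append(candidate)
    return points

# The same seed always yields the same layout:
# place_systems(55577) == place_systems(55577)
```

The point of the sketch is the property, not the geometry: once the real generator and its PRNG are recovered from the bytecode, the same seed and rules give the same output on any machine.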

## hlboot.dat — bytecode signals

String analysis (full dump in `hl_strings.txt`, 104k strings) confirms:

- **Networking = hxbit** (Shiro's binary serializer): 306 `__net_mark` networked-property markers, `NetworkSerializable`, (un)serialize routines → it's a **server-authoritative** model, consistent with the no-offline-mode design.
- Procedural gen classes present client-side: `PlanetGen`, `PlanetGenPrefab`, `PlanetGenKind`.
- Auth/transport: `SteamAuth`, `AES`, `TLS` references → the server link is presumably authenticated and encrypted (string references alone don't show how they are used).
- Endpoints: `data.shirogames.com` (telemetry/data) and the Steam store page.
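
The string counts above can be reproduced without a decompiler by scanning the file for printable runs, in the manner of the `strings` tool. A minimal sketch follows; the four-character minimum and the function names are assumptions, and it counts every extracted string that contains a marker.

```python
import re
from collections import Counter

def extract_strings(blob: bytes, min_len: int = 4):
    """Yield printable-ASCII runs of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(blob):
        yield match.group().decode("ascii")

def count_markers(blob: bytes, markers):
    """Count how many extracted strings contain each marker substring."""
    hits = Counter()
    for text in extract_strings(blob):
        for marker in markers:
            if marker in text:
                hits[marker] += 1
    return hits

# Usage:
# with open("hlboot.dat", "rb") as f:
#     print(count_markers(f.read(), ["__net_mark", "PlanetGen", "SteamAuth"]))
```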

Full decompilation (recovering the actual generator + the hxbit class schemas) needs a HashLink decompiler — **hlbc** (Rust) or **crashlink** (Python) — pointed at `hlboot.dat`. The bytecode header is intact, so this is straightforward; I just didn't run a full decompiler here.

## prefs.sav & save file

- **`prefs.sav`** is a plaintext Haxe-serialized config with **110 flags**. It includes a developer/debug surface that is present (mostly off) in the shipped client: `admin`, `allowTpRadar`, `forceChangeServer`, `debugCombat`, `debugEditor`, `debugResources`, `galaxyDebugLine`, `planetDebugProps`, `freeCam*`, `dumpBaseInfos`, `serverLog`, `clientNetworkLog`. It also records `region: na`, which indicates this client is set to the **North America** shard.
- **`save/udat_S50993e.sav`** (86 bytes) is just a tiny session/SID token blob, not world data — consistent with all real state living server-side.
- Localization for 8 languages ships as XML (`extra/lang/...`), extracted alongside.

## Bottom line

Static datamining is essentially **complete**: the full asset tree, the complete game database (items, recipes, resources, tech, factions, ships), and the world-generation rules and seeds are all in hand. A wiki/database could be built from `data.cdb` today. The one open static task is decompiling `hlboot.dat` to recover the generator itself. The only thing that still requires live interaction is dynamic player/base state; the static galaxy map should be reconstructable offline by reimplementing the seeded generator, which avoids the ToS-risky packet-capture path entirely.

*Files in this folder: `DATAMINE_REPORT.md`, `data.cdb`, `cdb_sheets.tsv`, `pak_filelist.tsv`, `worldgen.json`, `items.json`, `recipes.json`.*
