データ資産化 — Provenance / Source Weight / Owner 承認フロー Data asset — Provenance / Source weight / Owner approval flow
Parky は駐車場データを 11 種類のソース(owner / admin / field / ugc / akippa / navitime / google / osm / kokudo / jichitai / import)から取り込みます。Phase 0 (2026-05-04 dev DB apply 済) で field 単位の provenance モデルを導入し、駐車場の各属性ごとに「誰が・いつ・どこから」入れた値かを記録できる構造に切り替えました。本ページは社内エンジニア向けに、provenance / source_weight / Owner 承認フロー / Raw landing zone / ハウスキーピング / バックアップを横断的にまとめます。
Parky ingests parking-lot data from 11 sources (owner / admin / field / ugc / akippa / navitime / google / osm / kokudo / jichitai / import). Phase 0 (applied to dev on 2026-05-04) introduced field-level provenance: every attribute is recorded with who supplied it, when, and from which source. This page is an internal engineering reference covering provenance, source_weight, the owner approval flow, the raw landing zone, housekeeping, and backups.
infra/supabase/migrations/2026-05-04_parking_data_asset_phase0_README.md と Phase 0c/0d/0e/0g の各 SQL ヘッダコメントを参照。
バックアップ運用は ops/db-backup-runbook。
See infra/supabase/migrations/2026-05-04_parking_data_asset_phase0_README.md and the SQL header comments for Phase 0c/0d/0e/0g.
Backup ops live in ops/db-backup-runbook.
1. 全体構造1. Architecture overview
Phase 0 では parking_lots テーブルを VIEW にリネーム し、その背後に canonical identity テーブル parking_lots_master + provenance ログ parking_field_values を置きました。INSTEAD OF triggers で透過動作させているため、API 層の SQL (FROM public.parking_lots) は無改修で動きます。
Phase 0 turned parking_lots into a VIEW backed by a canonical identity table (parking_lots_master) and a provenance log (parking_field_values). INSTEAD OF triggers make this transparent — existing API SQL (FROM public.parking_lots) keeps working unchanged.
flowchart LR
subgraph CLIENT["Client / API"]
A[FROM public.parking_lots]
end
subgraph VIEW["VIEW layer (透過)"]
V["parking_lots (VIEW)
INSTEAD OF triggers"]
end
subgraph CORE["Core tables"]
M["parking_lots_master
canonical identity
(id / code / status /
location / shape_type)"]
F["parking_field_values
provenance log
(field_name / source /
captured_at / priority_score)"]
C["parking_field_confirmations
UGC 補強"]
end
subgraph SEARCH["Search layer"]
S["parking_search_v (MV)
field_values primary を pivot"]
end
subgraph INGEST["Raw landing zone"]
R["parking_ingest_runs"]
P["parking_ingest_payloads"]
end
A ==> V
V -->|read| M
V -->|read primary| F
V -->|INSTEAD OF INSERT/UPDATE| M
V -->|INSTEAD OF INSERT/UPDATE| F
F --> C
M --> S
F --> S
R --> P
P -->|upsert_ingest_payload| F
2. parking_field_values テーブル2. The parking_field_values table
Phase 0 の中心。(parking_lot_id, field_name, source, captured_at) を 1 行として、駐車場 1 属性につき複数 source の値が時系列で蓄積されます。
The heart of Phase 0. One row per (parking_lot_id, field_name, source, captured_at) tuple — multiple values per attribute, accumulated chronologically.
| 列 | 役割 | Purpose |
|---|---|---|
parking_lot_id | 対象駐車場 (FK to parking_lots_master) | Target lot (FK to parking_lots_master) |
field_name | 属性名 (Phase 0: name / address / total_spaces / structure / entry_method / receipt_available / pricing_notes / vehicle_type_max。Phase 0.5 追加: contact_phone / contact_email / contact_url / capacity_disabled / capacity_kei / capacity_regular / capacity_oversized / capacity_motorcycle / capacity_ev) | Attribute name (Phase 0: name / address / total_spaces / structure / entry_method / receipt_available / pricing_notes / vehicle_type_max. Phase 0.5 added: contact_phone / contact_email / contact_url / capacity_disabled / capacity_kei / capacity_regular / capacity_oversized / capacity_motorcycle / capacity_ev) |
value | jsonb (string / number / boolean any) | jsonb (string / number / boolean / any) |
source | data_source enum (11 種) | data_source enum (11 sources) |
contributor_id | 入稿元 user / admin / surveyor の uuid (匿名 ugc は NULL) | Submitter (user / admin / surveyor uuid; NULL for anonymous UGC) |
captured_at | 入稿時刻 | Submission timestamp |
confirms_n | 他ソースが同値で補強した回数 (UGC で重要) | Times this value was confirmed by other sources (matters for UGC) |
is_primary | この (lot, field) で公開される値か (1 行のみ true) | Whether this is the primary value exposed for the (lot, field) (only one row) |
priority_score | source_weight + recency + confirms から計算 (§4 参照) | Computed from source_weight + recency + confirms (see §4) |
approved_at | owner / admin 承認の時刻 (未承認 = NULL) | Approval timestamp (NULL = unapproved) |
approved_by | 承認した admin の uuid | Approving admin uuid |
UNIQUE INDEX idx_pfv_primary ON (parking_lot_id, field_name) WHERE is_primary AND deleted_at IS NULL で、同 (lot, field) で primary が 1 行のみであることを物理保証します。
A partial UNIQUE INDEX (idx_pfv_primary) physically enforces "one primary row per (lot, field) where not deleted."
3. data_source ENUM と source_weight3. data_source ENUM and source_weight
どのソース由来の値を信頼するかは source_weight で表現します。実装は public.calc_priority_score(source data_source, captured_at timestamptz, confirms_n int) RETURNS double precision 関数。
How much we trust each source is encoded in source_weight, materialized in public.calc_priority_score(source, captured_at, confirms_n) RETURNS double precision.
| source | weight | 由来 | Origin | 承認要否 | Approval |
|---|---|---|---|---|---|
field | 1.00 | 実地調査 | Field survey | 不要 (現地確認済) | none (already verified) |
owner | 0.90 | オーナー入稿 | Owner submission | active 中の編集は admin 承認後 | active lots: admin approval required |
admin | 0.85 | 運営手動編集 | Manual ops edit | 即時 (権限内) | immediate |
ugc | 0.60 | ユーザー投稿 | User-generated content | confirms_n で重み付け | weighted by confirms_n |
akippa | 0.50 | 提携 API | Partner API | 自動承認 | auto-approved |
navitime | 0.50 | 提携 API | Partner API | 自動承認 | auto-approved |
jichitai | 0.45 | 自治体オープンデータ | Municipal open data | 自動承認 | auto-approved |
kokudo | 0.40 | 国土数値情報 | National geographic data | 自動承認 | auto-approved |
google | 0.35 | Google Places | Google Places | 自動承認 | auto-approved |
osm | 0.30 | OpenStreetMap | OpenStreetMap | 自動承認 | auto-approved |
import | 0.25 | Phase 0 マイグレーション由来 (旧 parking_lots からの shadow record) | Migrated from Phase 0 baseline (shadow records from old parking_lots rows) | 遡及承認 | retrospectively approved |
4. priority_score の計算4. priority_score calculation
calc_priority_score は 3 要素を線形合成します。係数は dev DB 側で調整可能 (将来 displacement batch でチューニング)。
calc_priority_score linearly combines three signals. Coefficients are tunable on dev DB (will be tuned via the displacement batch later).
priority_score
= 1.00 * source_weight(source) -- 0.25 .. 1.00
+ 0.30 * recency_factor(captured_at) -- 0.0 .. 0.3 (90 日 half-life の指数減衰)
+ 0.15 * confirm_factor(confirms_n) -- 0.0 .. 0.15 (log scale, cap 5)
同点処理: priority_score 同点は captured_at DESC で勝負。それでも引き分けなら最後に書き込んだ行が primary に。
Tie-breaks: ties on priority_score fall back to captured_at DESC; further ties go to the most recent insert.
5. is_primary の自動再計算5. is_primary auto-recompute
parking_field_values への INSERT / UPDATE で trg_pfv_recompute_primary が発火し、対象 (lot, field) の中から 承認済 (approved_at IS NOT NULL) かつ priority_score 最大の 1 行を is_primary=TRUE、それ以外を FALSE に書き換えます。未承認の owner / admin レコードは候補から除外されるため、Owner Portal で PATCH しても admin 承認まで公開値は変わりません。
Insert / update on parking_field_values triggers trg_pfv_recompute_primary, which picks the highest priority_score among approved rows and flips it to is_primary=TRUE (others to FALSE). Unapproved owner / admin rows are skipped — so an owner PATCH never moves the public value until admin approves.
6. Owner 承認フロー (parking_lots VIEW INSTEAD OF triggers)6. Owner approval flow (parking_lots VIEW INSTEAD OF triggers)
Owner 編集の挙動は 駐車場 status と caller の権限で分岐します。INSTEAD OF triggers (parking_lots_view_insert/update/delete) と provenance helper (parking_lots_view_provenance) で実装。
Owner edit behavior branches on lot status and caller role. Implemented in INSTEAD OF triggers (parking_lots_view_insert/update/delete) plus a provenance helper (parking_lots_view_provenance).
| caller | Caller | status | 挙動 | Behavior |
|---|---|---|---|---|
is_admin() = true |
(any) | source='admin', approved_at=now() で即時公開 | source='admin', approved_at=now() — published immediately | |
| owner | active |
source='owner', approved_at=NULL でキューに入る (admin 審査待ち) | source='owner', approved_at=NULL — queued for admin review | |
| owner | draft / pending |
source='owner', approved_at=now() で即時反映 (作業中) | source='owner', approved_at=now() — applied immediately (still being drafted) | |
admin が status: pending → active |
— | トリガ trg_parking_lot_approve_owner_data が当該駐車場の全 owner 未承認レコードを一括承認 |
Trigger trg_parking_lot_approve_owner_data bulk-approves every queued owner row on that lot |
|
申請中の field を Owner Portal で表示するため、GET /v1/owner/parking-lots/mine/{lotId}/pending-fields endpoint が source='owner' AND approved_at IS NULL AND deleted_at IS NULL の行を返します。Portal 側は PublicationTab で「N 件の修正が運営審査中」と提示します。
For Owner Portal visibility, GET /v1/owner/parking-lots/mine/{lotId}/pending-fields returns rows with source='owner' AND approved_at IS NULL AND deleted_at IS NULL. The portal renders "N edits awaiting admin review" in PublicationTab.
7. Raw landing zone (Phase 0d)7. Raw landing zone (Phase 0d)
外部 API (akippa / navitime / google / osm 等) から取り込んだ生データを parking_field_values へ流し込む前段に parking_ingest_payloads として保管します。失敗時のリトライ、過去版の参照、新フィールド抽出のリプレイに使います。
External API ingests (akippa / navitime / google / osm / ...) land in parking_ingest_payloads before being projected into parking_field_values. Used for retry, historical lookup, and replay when extracting new fields.
parking_ingest_runs— 1 batch 取り込みの metadata (source / started_at / finished_at / counts)parking_ingest_payloads— 1 件の raw payload (jsonb) + hash ベースの dedup + supersede chainbff_only.upsert_ingest_payload(...)— 取り込みの単位 RPC (run_id / external_id / hash で冪等)bff_only.start_ingest_run / finish_ingest_run— run lifecycleparking_ingest_payloads_active(VIEW) — superseded されていない最新版だけを返す
8. ハウスキーピング (Phase 0e)8. Housekeeping (Phase 0e)
Raw landing zone は何もしないと無限に肥大化するため、リテンション + プルーンを pg_cron で自動運用します。
The raw zone would grow unbounded — pg_cron handles retention + pruning.
| 関数 | Function | 役割 | Purpose | cron |
|---|---|---|---|---|
public.prune_ingest_payloads(retention_days, min_versions, dry_run, triggered_by) | superseded 版を保持期間後に hard delete (直近 N 版は確保) | Hard-delete superseded payloads after retention (keep last N) | Sat 19:00 UTC (Sun 04:00 JST) | |
public.prune_ingest_runs(retention_days) | 古い run metadata を削除 (FK cascade なし、子 payload とは別管理) | Drop old run rows (no FK cascade; child payloads are managed separately) | (同上) | |
public.purge_soft_deleted_payloads(grace_days) | deleted_at から N 日経過した soft-delete 行を物理削除 | Hard-delete soft-deleted rows older than N days | (同上) | |
全ハウスキーピング実行は infra.parking_ingest_housekeeping_log に行き先 (rows_examined / rows_deleted / parameters / triggered_by) と一緒に記録され、infra.parking_ingest_storage_v view でテーブルサイズと併せて監視できます。
Each housekeeping run is logged in infra.parking_ingest_housekeeping_log (rows_examined / rows_deleted / parameters / triggered_by). Combine with infra.parking_ingest_storage_v for table-size monitoring.
9. 検索層 (parking_search_v MV)9. Search layer (parking_search_v MV)
地図検索 / 一覧 API は parking_search_v (MATERIALIZED VIEW) を読みます。parking_lots_master + parking_field_values の primary を pivot した形で、ratings / hourly_rate も JOIN 済。setup_parking_search_v_refresh_cron() で pg_cron に REFRESH MATERIALIZED VIEW CONCURRENTLY が登録されます。
Map search and list APIs read parking_search_v (MATERIALIZED VIEW): a pivot of parking_lots_master + the primary parking_field_values, pre-joined with ratings and hourly_rate. setup_parking_search_v_refresh_cron() registers a pg_cron REFRESH MATERIALIZED VIEW CONCURRENTLY.
10. バックアップ (Phase 0g)10. Backups (Phase 0g)
Phase 0f (per-row R2 archive) は over-engineered として revert し、代わりに pg_dump → R2 → Wasabi の 3 拠点バックアップに統一しました。
Phase 0f (per-row R2 archive) was reverted as over-engineered, replaced with a unified pg_dump → R2 → Wasabi 3-vendor backup.
- Xserver cron 02:30 JST で
scripts/ops/backup-supabase-to-r2.sh実行 →db/parky-{prod|dev}/YYYY/MM/YYYY-MM-DD.dumpとして R2 (parky-archivebucket) に PUT - 03:00 JST で
scripts/ops/backup-r2-to-wasabi.shが R2 → Wasabi (versioning ON / noncurrent 90 日 expire) に rclone copy - 復元は
scripts/db/restore-supabase-from-r2.sh --env prod --date 2026-04-01 --dry-runで dump 取得、本番 restore は別 DB へ
詳細運用は ops/db-backup-runbook.html。
Operational details in ops/db-backup-runbook.html.
11. 表で覚える Phase 0 の構成物11. Phase 0 inventory at a glance
| Phase | 追加されたもの | Added | 主目的 | Goal |
|---|---|---|---|---|
| 0a/0b | parking_lots → VIEW rename / parking_lots_master / parking_field_values / INSTEAD OF triggers / RLS | parking_lots VIEW rename / parking_lots_master / parking_field_values / INSTEAD OF triggers / RLS | field 単位 provenance の core | Core of field-level provenance |
| 0c | parking_field_confirmations / increment_confirms_n / run_displacement_batch | parking_field_confirmations / increment_confirms_n / run_displacement_batch | UGC confirms 補強と priority_score 再計算 | UGC confirmation + priority_score recompute |
| 0d | parking_ingest_runs / parking_ingest_payloads / upsert_ingest_payload | parking_ingest_runs / parking_ingest_payloads / upsert_ingest_payload | 外部取り込みの raw 永続化 (replay 可) | Persist raw external ingests (for replay) |
| 0e | prune_ingest_payloads / prune_ingest_runs / purge_soft_deleted_payloads / housekeeping_log | prune_ingest_payloads / prune_ingest_runs / purge_soft_deleted_payloads / housekeeping_log | retention / プルーンの自動化 | Automated retention / pruning |
| 0f | over-engineered と判断 | Reverted as over-engineered | ||
| 0g | backup-supabase-to-r2.sh / backup-r2-to-wasabi.sh / restore-supabase-from-r2.sh + Xserver cron | backup-supabase-to-r2.sh / backup-r2-to-wasabi.sh / restore-supabase-from-r2.sh + Xserver cron | DB の長期 3 拠点バックアップ | Long-term 3-vendor DB backup |
| 0.5 | codes 拡張 (lot_structure 10 値 / lot_entry 8 値 / lot_type 6 値) / parking_field_values の field_name 8→17 / parking_operators テーブル新設 + master.operator_id FK / tags 14 個追加 (vehicle / accessibility / 監視員 / 照明) / parking_spots に EV 詳細 5 列 | codes expansion (lot_structure 10 / lot_entry 8 / lot_type 6) / parking_field_values field_name 8→17 / new parking_operators table + master.operator_id FK / 14 tags (vehicle / accessibility / supervisor / lighting) / parking_spots EV detail 5 columns | Phase 1 (OSM / UGC) 着手前の駐車場データモデル整地 — 監査レポート (.work/output/parky/2026-05-04_008) の High 5 件 + Mid 7 件を解消 | Pre-Phase-1 schema tidy-up — addresses High 5 + Mid 7 gaps from audit (.work/output/parky/2026-05-04_008) |
12. 参考: 主要 RPC 一覧12. Reference: key RPCs
| schema.function | 役割 | Purpose |
|---|---|---|
public.calc_priority_score(source, captured_at, confirms_n) | priority_score 計算 (provenance 評価) | Compute priority_score from provenance |
public.run_displacement_batch(...) | priority_score を全件再計算 + primary 入替 (Phase 1 で recency 改良予定) | Recompute priority_score + flip primaries (recency tuning planned in Phase 1) |
public.prune_ingest_payloads(retention_days, min_versions, dry_run, triggered_by) | superseded 版の hard delete (直近 N 版確保) | Hard-delete superseded payloads (keep last N versions) |
public.prune_ingest_runs(retention_days) | 古い run の削除 | Drop old runs |
public.purge_soft_deleted_payloads(grace_days) | soft-delete からの確定削除 | Hard-delete from soft-deleted state |
public.setup_parking_search_v_refresh_cron() | parking_search_v REFRESH の pg_cron 登録 | Schedule parking_search_v REFRESH via pg_cron |
public.setup_displacement_batch_cron() | displacement batch の pg_cron 登録 | Schedule the displacement batch via pg_cron |
public.setup_ingest_housekeeping_cron() | ハウスキーピング系 3 関数の pg_cron 登録 | Schedule the 3 housekeeping functions via pg_cron |
bff_only.upsert_ingest_payload(...) | 外部取り込みの単位 RPC (hash 冪等) | Per-record ingest RPC (hash-idempotent) |
bff_only.start_ingest_run / finish_ingest_run | 取り込み batch の lifecycle | Ingest batch lifecycle |
bff_only.increment_confirms_n(field_value_id) | UGC confirm の単位 RPC | UGC confirm increment RPC |
13. Phase 1 以降 (スコープ外 / 参考)13. Future phases (out of scope, for reference)
- Phase 1: OSM Overpass 取り込み / UGC submit/confirm API / displacement batch の recency 改良
- Phase 1: OSM Overpass ingest / UGC submit/confirm API / recency tuning in displacement batch
- Phase 2: NAVITIME / akippa 連携 / ownership_rate ダッシュボード / 実地調査ファイル取り込み
- Phase 2: NAVITIME / akippa partnerships / ownership_rate dashboard / field survey ingest
- Phase 3: B2B データ提供 (OSM 派生は除外、Parky 所有データのみ)
- Phase 3: B2B data delivery (OSM-derived data excluded, only Parky-owned)
Phase 0 で provenance / source_weight / Owner 承認フロー / Raw landing zone / ハウスキーピング / バックアップが揃ったため、Phase 1 以降は 新ソースの追加と新 ingestion script の実装 だけで済むはず、という設計判断になっています。
With Phase 0 in place — provenance, source_weight, owner approval flow, raw landing zone, housekeeping, backups — later phases should reduce to adding new sources and new ingestion scripts only.