Extraction Export Library #
@bonsai/extraction-export is an independent TypeScript library that converts extracted document data (invoices, bank statements, direct expenses) into 27+ accounting software formats. It was extracted from apps/webapp/src/shared/lib/export into libs/typescript/extraction-export/ to enable reuse across services and enforce a clean dependency boundary.
Architecture #
Directory Layout #
libs/typescript/extraction-export/
├── build.ts # Bun bundler configuration
├── package.json # Scripts, exports, dependencies
├── tsconfig.json # TypeScript config (noEmit for dev)
├── tsconfig.build.json # TypeScript config (declaration-only for build)
├── scripts/
│   └── generate-metadata.ts # Codegen: ts-morph → export-format-metadata.tsx
└── src/
    ├── index.ts # Server entry point (full exports)
    ├── client.ts # Client entry point (no Node.js deps)
    ├── format-converters.tsx # Base types: FormatConverter, ExportFormat, enriched types
    ├── ap-bill-format-converters.tsx # AP Bill converters (22 formats)
    ├── ar-invoice-format-converters.tsx # AR Invoice converters (15 formats)
    ├── bank-statement-format-converters.tsx # Bank Statement converters (8 formats)
    ├── direct-expense-format-converters.tsx # Direct Expense converters (3 formats)
    ├── export-format-metadata.tsx # Auto-generated (gitignored), client-safe metadata
    ├── export-utils.ts # Export orchestration: enrichment, validation, archiving
    ├── direct-expense-types.ts # Enriched direct expense types
    ├── logger.ts # Pluggable logger (configureLogger)
    ├── icons/ # React SVG icons for accounting software logos
    ├── types/ # API types (Invoice, BankStatement, etc.)
    └── utils/ # Shared utilities (round, StrictSubset)
Key Modules #
| Module | Purpose |
|---|---|
| format-converters.tsx | Base interfaces (FormatConverter, ExportFormat), enriched types, ExportFormatType union, formatDate utility |
| ap-bill-format-converters.tsx | All AP Bill converter classes (QBO, Xero, AutoCount, MYOB, Sage, etc.) |
| ar-invoice-format-converters.tsx | All AR Invoice converter classes |
| bank-statement-format-converters.tsx | All Bank Statement converter classes |
| direct-expense-format-converters.tsx | All Direct Expense converter classes (Zoho, MYOB, SQL Accounting) |
| export-utils.ts | High-level functions: enrichInvoicesWithAccountingData, exportInvoices, createMultiFileExport, validation |
| export-format-metadata.tsx | Auto-generated: lightweight metadata (id, name, icon, disabled) for client-side UI format selectors |
| logger.ts | Pluggable logger; the webapp bridges its Sentry/Datadog logger via configureLogger() |
Dual Entry Points #
The library provides two entry points to separate server-side and client-side concerns:
@bonsai/extraction-export (server) #
Full library with all converter classes, export utilities, and Node.js dependencies (archiver, node-xlsx). Use this in API routes and server-side code.
import {
  exportInvoices,
  getApBillConverter,
  configureLogger,
} from '@bonsai/extraction-export';
@bonsai/extraction-export/client (client-safe) #
Types, metadata getters, icons, and lightweight values only. No archiver or node-xlsx — safe for browser bundles.
import {
  getApBillExportFormatMetadata,
  configureAssetBaseUrl,
  type ApBillExportFormatType,
} from '@bonsai/extraction-export/client';
The client entry point exports:
- All types (type-only re-exports from converter modules)
- Export format metadata getters (from the auto-generated export-format-metadata.tsx)
- Icons and configureAssetBaseUrl
- Logger, utility functions, and enums
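To illustrate how a client component might turn this metadata into a format selector, here is a minimal sketch. The ExportFormatMetadata shape follows the fields this document names (id, name, icon, disabled), but its exact types, the sample entries, and the toSelectOptions helper are assumptions for illustration, not the library's API.

```typescript
// Hypothetical shape, mirroring the metadata fields listed in this document.
interface ExportFormatMetadata {
  id: string;
  name: string;
  icon?: unknown; // a React SVG icon in the real library; left opaque here
  disabled?: boolean;
}

// Sample data standing in for what a metadata getter might return.
const sampleMetadata: ExportFormatMetadata[] = [
  { id: 'qbo', name: 'QuickBooks Online' },
  { id: 'xero', name: 'Xero' },
  { id: 'sap', name: 'SAP', disabled: true },
];

interface SelectOption {
  value: string;
  label: string;
}

// Hypothetical helper: drop disabled formats and map the rest to select options.
function toSelectOptions(metadata: ExportFormatMetadata[]): SelectOption[] {
  return metadata
    .filter((entry) => !entry.disabled)
    .map((entry) => ({ value: entry.id, label: entry.name }));
}

const options = toSelectOptions(sampleMetadata);
```

Because the sketch relies only on plain data, nothing in it would pull archiver or node-xlsx into a browser bundle.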
How Webapp Consumes the Library #
The webapp configures the library at import time through two config files:
- export-config.ts: Sets the R2 CDN base URL for format icons with configureAssetBaseUrl('https://assets.gotofu.com')
- export-config.server.ts: Bridges the webapp’s structured logger with configureLogger(logger as Logger)
Server-side API routes (/api/export-invoices, /api/export-bank-statements, etc.) import from @bonsai/extraction-export. Client-side UI components import from @bonsai/extraction-export/client.
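The pluggable-logger idea behind configureLogger can be sketched as a module-level reference that the host application swaps out. This is an assumption about the general pattern, not the contents of logger.ts: the Logger interface below is a reduced, hypothetical contract, and getLogger is an illustrative helper.

```typescript
// Hypothetical minimal Logger contract; the library's interface may differ.
interface Logger {
  info(message: string): void;
  error(message: string): void;
}

// Default to a no-op logger so the library stays silent until configured.
const noopLogger: Logger = {
  info: () => {},
  error: () => {},
};

let activeLogger: Logger = noopLogger;

// The host application calls this once, at import time.
function configureLogger(logger: Logger): void {
  activeLogger = logger;
}

// Library code reads the current logger through this accessor (illustrative).
function getLogger(): Logger {
  return activeLogger;
}

// Usage: bridge a host logger that records messages in memory.
const captured: string[] = [];
configureLogger({
  info: (message) => {
    captured.push(`info: ${message}`);
  },
  error: (message) => {
    captured.push(`error: ${message}`);
  },
});
getLogger().info('export started');
```

Keeping the logger behind an accessor means the library never imports the host's logging stack directly, which fits the dependency boundary described at the top of this page.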
Build System #
Build Pipeline #
The build runs two stages:
pnpm build
# Which executes: pnpm codegen && bun run build.ts
- Codegen (pnpm codegen): Runs scripts/generate-metadata.ts, which uses ts-morph to parse the 4 converter files, extract lightweight metadata (id, name, icon, disabled) from each converter class, and write src/export-format-metadata.tsx. This file is gitignored and regenerated on every build.
- Bundle (bun run build.ts):
  - Cleans dist/
  - Builds ESM bundles with Bun for both entry points (index.ts, client.ts)
  - Externalizes react, archiver, date-fns, node-xlsx
  - Generates source maps
  - Runs tsc --project tsconfig.build.json to emit .d.ts declaration files
  - Runs tsc-alias to resolve @/ path aliases in declarations
Postinstall #
"postinstall": "[ -d dist ] || pnpm build"
Builds the library on pnpm install if dist/ doesn’t exist. This ensures the library is available to consumers without an explicit build step in CI and local development.
Codegen Details #
scripts/generate-metadata.ts uses ts-morph to:
- Parse the 4 converter source files (ap-bill-format-converters.tsx, etc.)
- Walk class hierarchies to find format, name, icon, and disabled properties
- Generate a single export-format-metadata.tsx with typed metadata arrays and getter functions
- Format the output with Biome
Each section is configured with a SectionConfig specifying the source file, type name, and output variable name. Bank statement formats are sorted alphabetically; others preserve source order.
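The per-section ordering rule can be sketched as follows. The SectionConfig field names, the sortAlphabetically flag, and the choice to sort by display name are all assumptions made for illustration; the real script may express this differently, and the ts-morph parsing step is omitted entirely.

```typescript
// Hypothetical SectionConfig; field names are illustrative only.
interface SectionConfig {
  sourceFile: string;
  typeName: string;
  outputVariable: string;
  sortAlphabetically?: boolean; // assumed flag, not confirmed by the source
}

interface FormatMetadata {
  id: string;
  name: string;
}

// Return entries in source order, or sorted by name when the section asks for it.
function orderSection(config: SectionConfig, entries: FormatMetadata[]): FormatMetadata[] {
  const copy = entries.slice(); // never mutate the caller's array
  if (!config.sortAlphabetically) return copy;
  return copy.sort((a, b) => a.name.localeCompare(b.name));
}

const bankStatementSection: SectionConfig = {
  sourceFile: 'src/bank-statement-format-converters.tsx',
  typeName: 'BankStatementExportFormatType', // assumed name
  outputVariable: 'bankStatementExportFormats', // assumed name
  sortAlphabetically: true,
};

const sorted = orderSection(bankStatementSection, [
  { id: 'xero', name: 'Xero' },
  { id: 'bukku', name: 'Bukku' },
  { id: 'odoo', name: 'Odoo' },
]);
```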
To verify codegen is up to date:
pnpm codegen:check
# Runs codegen, then git diff --exit-code to detect uncommitted changes
Format Converter Types #
The library supports 4 document types, each with its own set of accounting software converters:
AP Bill (Accounts Payable) #
22 formats including: QBO, QBO Desktop, Xero, AutoCount, AutoCount Cloud, Bukku, FAST, FAST Voucher, Winton, MYOB, Sage, Odoo, Freee, Yayoi, SAP, Zoho Bill, QNE (Past Bills & Purchase Invoices), PEAK, ThaiTax, SQL Accounting, AccountingAI
AR Invoice (Accounts Receivable) #
15 formats including: QBO, QBO Desktop, Xero, AutoCount, Bukku, FAST, Winton, MYOB, Sage, QNE (Past Invoices & Sales Invoices), Zoho Sales Invoice, PEAK, SQL Accounting, AccountingAI
Bank Statement #
8 formats including: QBO, Xero, AutoCount, Bukku, Odoo, Zoho, PEAK, AccountingAI
Direct Expense #
3 formats: Zoho Expense, MYOB Direct Expense, SQL Accounting
Converter Interface #
Each converter implements a consistent interface:
interface FormatConverter {
  format: ExportFormatType; // e.g. 'qbo', 'xero'
  headers: string[]; // CSV/XLSX column headers
  convertInvoice: (invoice: EnrichedInvoice) => (string | number | Date | null)[][];
  getFileExtension: () => string; // 'csv' or 'xlsx'
  getContentType: () => string; // MIME type
  // Optional: custom export with metadata rows
  getExportWithMetadata?: (
    invoices: EnrichedInvoice[],
    metadata?: ExportMetadata
  ) => string | Buffer;
}
Bank statement and direct expense converters follow the same pattern with type-appropriate methods (convertBankStatement, convertDirectExpense).
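As a rough illustration of what implementing this interface could look like, the sketch below defines a made-up converter that emits one row per line item. Everything here is a simplified stand-in: the invoice fields (invoice_number, line_items, and so on), the 'example' format id, and the ExampleCsvConverter class are hypothetical and do not correspond to a real converter in the library.

```typescript
// Simplified, hypothetical stand-ins for the library's types.
type Cell = string | number | Date | null;

interface EnrichedInvoiceLineItem {
  description: string;
  quantity: number;
  unit_price: number;
}

interface EnrichedInvoice {
  invoice_number: string;
  line_items: EnrichedInvoiceLineItem[];
}

interface FormatConverter {
  format: string;
  headers: string[];
  convertInvoice: (invoice: EnrichedInvoice) => Cell[][];
  getFileExtension: () => string;
  getContentType: () => string;
}

// Hypothetical converter: one output row per invoice line item.
class ExampleCsvConverter implements FormatConverter {
  format = 'example';
  headers = ['InvoiceNo', 'Description', 'Quantity', 'UnitPrice'];

  convertInvoice = (invoice: EnrichedInvoice): Cell[][] =>
    invoice.line_items.map((item) => [
      invoice.invoice_number,
      item.description,
      item.quantity,
      item.unit_price,
    ]);

  getFileExtension = (): string => 'csv';
  getContentType = (): string => 'text/csv';
}

const converter = new ExampleCsvConverter();
const rows = converter.convertInvoice({
  invoice_number: 'INV-001',
  line_items: [
    { description: 'Paper', quantity: 2, unit_price: 4.5 },
    { description: 'Toner', quantity: 1, unit_price: 60 },
  ],
});
```

Each row has the same number of cells as headers, which is what lets a generic serializer write the header row followed by the data rows.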
Mise Tasks #
All tasks are defined in libs/typescript/.tasks.toml:
| Task | Command | Description |
|---|---|---|
| mise ts-lib-build | pnpm build | Run codegen + Bun bundler + tsc declarations |
| mise ts-lib-codegen | pnpm codegen | Generate export-format-metadata.tsx |
| mise ts-lib-check | Runs lint + format + typecheck in parallel | All quality checks |
| mise ts-lib-lint-check | pnpm lint (oxlint) | Lint source files |
| mise ts-lib-format-check | pnpm format (biome) | Check formatting |
| mise ts-lib-format-fix | pnpm format:fix | Auto-fix formatting |
| mise ts-lib-type-check | pnpm typecheck (tsc --noEmit) | Type checking |
CI/CD #
The library is built in CI through two mechanisms:
- ci.yml: The postinstall script handles it. pnpm install triggers [ -d dist ] || pnpm build, building the library if dist/ is absent.
- deploy.yaml: Explicitly sets up mise and runs mise ts-lib-build to ensure a fresh build before deployment.
Development Workflow #
Local Development #
# Build the library (codegen + bundle + declarations)
mise ts-lib-build
# Run all quality checks
mise ts-lib-check
# Fix formatting issues
mise ts-lib-format-fix
# Regenerate metadata after modifying converters
mise ts-lib-codegen
# Type-check only
mise ts-lib-type-check
After Modifying Converters #
When you add or change a converter class, you must regenerate metadata:
mise ts-lib-codegen # Regenerates export-format-metadata.tsx
mise ts-lib-build # Full rebuild including codegen
The postinstall hook rebuilds automatically when dist/ is missing, but during development you should run explicit builds after changes.
Key Conventions #
File Formats #
- CSV (text/csv): Most formats export as CSV. Headers are the first row, followed by one row per line item.
- Excel (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet): Some formats (FAST, FAST Voucher, SAP, QNE, SQL Accounting, MYOB Direct Expense) export as .xlsx using node-xlsx. These use getExportWithMetadata for custom sheet layouts.
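The CSV layout described above (header row first, then one row per line item) can be sketched with a small serializer. This is not the library's implementation: toCsv and escapeCsvCell are hypothetical helpers, and the RFC 4180 style quoting and ISO date rendering are assumptions about reasonable behavior.

```typescript
type Cell = string | number | Date | null;

// Quote a cell when it contains a comma, double quote, or line break.
function escapeCsvCell(cell: Cell): string {
  if (cell === null) return '';
  const text = cell instanceof Date ? cell.toISOString() : String(cell);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Header row first, then the data rows, joined with CRLF line endings.
function toCsv(headers: string[], rows: Cell[][]): string {
  const allRows: Cell[][] = [headers, ...rows];
  return allRows.map((row) => row.map(escapeCsvCell).join(',')).join('\r\n');
}

const csv = toCsv(
  ['Vendor', 'Amount'],
  [
    ['Acme, Inc.', 106],
    ['Globex', null],
  ],
);
```

In this sketch a null cell becomes an empty field, so every row keeps the same column count as the header.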
Date Formatting #
Each accounting software expects dates in specific formats. The formatDate(dateMs, format) utility wraps date-fns/format. Common patterns:
- dd/MM/yyyy: Default, Xero, AutoCount
- MM/dd/yyyy: QBO
- yyyy-MM-dd: Zoho, Odoo, Sage, SAP, PEAK
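To make the effect of these patterns concrete, the sketch below formats one timestamp for three targets. It deliberately does not use date-fns so that it runs on its own: formatDate here is a tiny stand-in that handles only the dd, MM, and yyyy tokens and reads the date in UTC, whereas the date-fns format function formats in local time. The DATE_PATTERNS lookup and formatForTarget helper are hypothetical.

```typescript
// Hypothetical lookup from target format id to date pattern.
const DATE_PATTERNS: Record<string, string> = {
  xero: 'dd/MM/yyyy',
  qbo: 'MM/dd/yyyy',
  zoho: 'yyyy-MM-dd',
};
const DEFAULT_PATTERN = 'dd/MM/yyyy';

// Stand-in for the library's formatDate: supports only dd, MM, yyyy, in UTC.
function formatDate(dateMs: number, pattern: string): string {
  const date = new Date(dateMs);
  const pad = (n: number): string => String(n).padStart(2, '0');
  const tokens: Record<string, string> = {
    yyyy: String(date.getUTCFullYear()),
    MM: pad(date.getUTCMonth() + 1), // getUTCMonth() is zero-based
    dd: pad(date.getUTCDate()),
  };
  return pattern.replace(/yyyy|MM|dd/g, (token) => tokens[token]);
}

// Fall back to the default pattern for targets without a specific entry.
function formatForTarget(dateMs: number, formatId: string): string {
  return formatDate(dateMs, DATE_PATTERNS[formatId] ?? DEFAULT_PATTERN);
}

const invoiceDate = Date.UTC(2024, 0, 5); // 5 January 2024
const forXero = formatForTarget(invoiceDate, 'xero'); // "05/01/2024"
const forQbo = formatForTarget(invoiceDate, 'qbo'); // "01/05/2024"
const forZoho = formatForTarget(invoiceDate, 'zoho'); // "2024-01-05"
```

The Xero and QBO outputs differ only in field order, which is why a wrong pattern can silently swap day and month on import.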
Tax Calculations #
- Converters respect InvoiceDataLineAmountType (INCLUSIVE vs EXCLUSIVE) to determine how tax is calculated
- Tax amounts are extracted from enriched line items (accounting_tax_rate)
- Rounding uses the round() utility from utils/number.ts
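The difference between INCLUSIVE and EXCLUSIVE amounts can be illustrated with a short worked sketch. Several details are assumptions: accounting_tax_rate is treated as a percentage (6 means 6%), round is a local stand-in that rounds to two decimals, and splitTax and the LineItem shape are hypothetical rather than taken from the converters.

```typescript
type InvoiceDataLineAmountType = 'INCLUSIVE' | 'EXCLUSIVE';

interface LineItem {
  amount: number;
  accounting_tax_rate?: number; // assumed to be a percentage, e.g. 6 for 6%
}

// Stand-in for the library's round() utility; two decimals assumed.
function round(value: number, decimals = 2): number {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
}

interface TaxSplit {
  net: number;
  tax: number;
  gross: number;
}

// Hypothetical helper: derive net, tax, and gross from one line amount.
function splitTax(item: LineItem, amountType: InvoiceDataLineAmountType): TaxSplit {
  const rate = (item.accounting_tax_rate ?? 0) / 100;
  if (amountType === 'INCLUSIVE') {
    // The amount already contains tax, so back the tax out of it.
    const net = round(item.amount / (1 + rate));
    return { net, tax: round(item.amount - net), gross: round(item.amount) };
  }
  // EXCLUSIVE: tax is added on top of the amount.
  const tax = round(item.amount * rate);
  return { net: round(item.amount), tax, gross: round(item.amount + tax) };
}

const inclusive = splitTax({ amount: 106, accounting_tax_rate: 6 }, 'INCLUSIVE');
const exclusive = splitTax({ amount: 100, accounting_tax_rate: 6 }, 'EXCLUSIVE');
```

With a 6% rate, both calls describe the same transaction: a net of 100, tax of 6, and gross of 106.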
Enriched Data #
Before export, invoices and bank statements are enriched with accounting data (contacts, accounts, tax rates, tags, items, locations) via enrichInvoicesWithAccountingData and enrichBankStatementsWithAccountingData in export-utils.ts. The enriched types (EnrichedInvoice, EnrichedInvoiceLineItem, etc.) extend the base API types with optional accounting fields.
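The idea of an enriched type extending a base type with optional accounting fields can be sketched as below. This is only an illustration of the pattern: the field names, the AccountingContact shape, the enrichWithContacts helper, and the choice to match contacts by vendor name are all assumptions, and the real enrichInvoicesWithAccountingData may match on different keys and attach more data.

```typescript
// Hypothetical base API type.
interface Invoice {
  id: string;
  vendor_name: string;
  total: number;
}

// Hypothetical accounting record to attach.
interface AccountingContact {
  id: string;
  name: string;
}

// Enriched type: the base type plus an optional accounting field.
interface EnrichedInvoice extends Invoice {
  accounting_contact?: AccountingContact;
}

// Illustrative enrichment: match contacts by case-insensitive vendor name.
function enrichWithContacts(
  invoices: Invoice[],
  contacts: AccountingContact[],
): EnrichedInvoice[] {
  const byName = new Map<string, AccountingContact>();
  for (const contact of contacts) {
    byName.set(contact.name.toLowerCase(), contact);
  }
  return invoices.map((invoice) => ({
    ...invoice,
    accounting_contact: byName.get(invoice.vendor_name.toLowerCase()),
  }));
}

const enriched = enrichWithContacts(
  [
    { id: 'inv-1', vendor_name: 'ACME', total: 106 },
    { id: 'inv-2', vendor_name: 'Unknown Vendor', total: 20 },
  ],
  [{ id: 'c-1', name: 'Acme' }],
);
```

Because the accounting field is optional, an invoice with no match still satisfies the enriched type, and converters have to handle the missing value.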
Validation #
Export utilities include validation functions (validateInvoiceForExport, validateBankStatementForExport, validateDirectExpenseForExport) that check for required fields before export.
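A required-field check of this kind might look like the sketch below. The set of required fields, the InvoiceForExport shape, the ValidationResult type, and the validateRequiredFields name are assumptions chosen for illustration; the source does not say which fields the library's validators actually require or what they return.

```typescript
// Hypothetical input shape; every field is optional so gaps can be detected.
interface InvoiceForExport {
  invoice_number?: string;
  invoice_date?: number; // epoch milliseconds, assumed
  line_items?: unknown[];
}

interface ValidationResult {
  valid: boolean;
  missingFields: string[];
}

// Collect every missing field instead of stopping at the first one,
// so a caller can report all problems at once.
function validateRequiredFields(invoice: InvoiceForExport): ValidationResult {
  const missingFields: string[] = [];
  if (!invoice.invoice_number) missingFields.push('invoice_number');
  if (invoice.invoice_date === undefined) missingFields.push('invoice_date');
  if (!invoice.line_items || invoice.line_items.length === 0) {
    missingFields.push('line_items');
  }
  return { valid: missingFields.length === 0, missingFields };
}

const complete = validateRequiredFields({
  invoice_number: 'INV-001',
  invoice_date: Date.UTC(2024, 0, 5),
  line_items: [{}],
});
const incomplete = validateRequiredFields({ invoice_number: 'INV-002' });
```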
Related Documentation #
- Webapp Documentation — Frontend that consumes this library
- API Documentation — Backend API providing the data
- Development Workflow — Branching strategy and PR process