Trade Policy · Large Language Models · ASEAN

A Computational Framework for Comparative Analysis of Free Trade Agreements

An exploration of whether Large Language Models can read FTA legal text well enough to support comparison work: extracting, classifying, and surfacing design differences across three ASEAN-centred agreements.

4,059 provisions extracted
6 LLM classification runs
11 policy categories
3 FTAs compared
0.442 best Macro-F1
The Challenge

The Asia-Pacific region is layered with overlapping Free Trade Agreements, each running to thousands of legal provisions. Comparing how two or three of them treat the same topic is slow, manual work, and the answer often turns on small differences buried deep in the text.

Trade economists call this the spaghetti bowl problem, a tangle of rules and thresholds that differ just enough between agreements to matter for exporters operating under more than one.

The Solution

This project tests whether an LLM-based pipeline can help with that comparison work. It segments the legal text of three ASEAN-centred FTAs into provisions, asks the model to classify each into a policy category, and uses retrieval to draft side-by-side notes on how each agreement handles the same topic.

The output is best read as a first-pass triage layer, useful for spotting where agreements diverge enough to warrant a closer manual read, not a substitute for it. The whole pipeline runs on free-tier APIs and can be pointed at any new FTA PDF.

Three Research Questions

The project is organised around three questions, one per layer of the pipeline. Answers and supporting numbers live on the Findings tab.

RQ1: Classification

Can LLMs reliably classify FTA legal provisions, and how does accuracy change across models and prompt strategies?

RQ2: Policy Design

How do comparable provisions differ across agreements in observable design features such as thresholds, governance structures, and scope?

RQ3: Convergence

Do the agreements show structural convergence or fragmentation in their treatment of key trade policy topics?

How the Pipeline Works
  • Extraction: 7 PDFs parsed with pdfplumber, PyMuPDF, and an OCR fallback into 4,059 provisions
  • Embedding: each provision vectorised with MiniLM and stored in ChromaDB for semantic retrieval
  • Classification: LLaMA 3.3 70B and Qwen 3 32B each run with zero-shot, few-shot, and CoT strategies
  • RAG Comparison: top provisions per category retrieved and fed to an LLM for a cross-agreement narrative
  • Validation: 50 hand-labelled provisions used to score all 6 model-strategy combinations
1. AHKFTA Source PDF — OCR Coverage Gap

The AHKFTA source PDF is a fully scanned document. Tesseract OCR extracted the goods-trade chapters (Chapter 2 Trade in Goods, Chapter 3 Rules of Origin, plus Annexes 2-1, 3-1, 3-2, 3-3) reliably, but degraded substantially on the legal-paragraph chapters that follow.

What this means in practice: AHKFTA in fact contains a 14-chapter structure (verifiable from its Table of Contents), but our extracted dataset under-represents the following chapters:

  • Chapter 8 — Trade in Services (with Annex 8-1 Schedules of Specific Commitments)
  • Chapter 10 — Intellectual Property
  • Chapter 13 — Consultations and Dispute Settlement (with Annex 13-1 Rules of Procedure for Arbitral Tribunals)
  • Chapter 6 (Standards), Chapter 7 (Trade Remedies), Chapter 11 (General Provisions and Exceptions), Chapter 12 (Institutional Provisions) — all partially captured at best

Affected findings: Apparent zeros for AHKFTA in Trade in Services, IP, and Dispute Settlement on the Provision Distribution and Convergence pages are extraction artefacts, not real legal absences. The fragmentation entropy scores for IP (0.37) and Dispute Settlement (0.47) are partly driven by this gap.

2. ASEAN-Hong Kong Investment Agreement Not Processed

A separate Agreement on Investment among the Governments of the Hong Kong Special Administrative Region of the People's Republic of China and the Member States of the Association of Southeast Asian Nations was signed on 18 May 2018 as a complementary instrument to the AHKFTA. This document has not yet been processed by the current pipeline.

Affected findings: The Investment entropy ratio (0.96, currently flagged as "convergent") cannot be interpreted reliably. The 4 AHKFTA provisions tagged as investment-related may include genuine references to the parallel Investment Agreement, but a complete substantive comparison of investment regimes across the three FTAs requires this document to be processed.

3. Validation Gold Set — Small and Single-Annotator

The validation gold set contains only 50 provisions, all labelled by the project author. The author is not a customs lawyer or FTA specialist, and no inter-annotator agreement (κ between two human labellers) was measured.

Statistical implications: With n = 50 and observed accuracy of 0.480, the 95% confidence interval is approximately 0.34 to 0.62. Point estimates of model performance therefore carry substantial uncertainty.
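
The interval quoted above can be reproduced with a normal-approximation (Wald) interval. The source does not state which method was used, so the choice of Wald here is an assumption, and the function name `wald_interval` is illustrative only.

```python
import math


def wald_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion.

    Assumption: a Wald interval with z = 1.96 (95%); the source does not
    say which interval method it used.
    """
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Best observed accuracy (0.480) on the 50-provision gold set.
low, high = wald_interval(0.480, 50)
print(f"{low:.2f} to {high:.2f}")  # 0.34 to 0.62
```

With n = 50 the half-width is about 14 percentage points, which is why differences of a few points between runs should not be over-read.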

Path to improvement: Expanding to ≥ 200 provisions labelled by ≥ 2 customs-law / FTA experts is the single highest-leverage improvement available. This requires institutional support that one researcher cannot provide alone.

4. Tariff Schedule Annexes Not Table-Extracted

Tariff commitments at the line-item level (HS code, base rate, staging category, phase-out year) live in Annex tariff schedules that are structured as multi-page tables rather than as paragraph text. The current extraction pipeline treats these as text fragments and does not preserve the row/column structure.

Affected findings: Quantitative tariff thresholds, particularly for AANZFTA (which delegates many threshold definitions to product-specific schedules), are under-recovered in the attribute extraction module. The Tariff Commitments category in classification reflects framework provisions, not actual rate schedules.

5. Few-Shot Prompt Bias

The two in-context examples used in the few-shot prompt for the stratified classification run were both goods-trade categories (one Rules of Origin, one Tariff). This biases both models toward goods-related classifications and away from services, investment, and intellectual property.

Affected findings: The high entropy ratio for Rules of Origin (0.97) is partially inflated by this exemplar bias. Future runs should use exemplars balanced across the full target taxonomy.

6. Infrastructure — Personal Laptop, Free-Tier APIs

The entire pipeline runs on a personal MacBook with no GPU. LLM inference uses Groq's free tier, which imposes a rolling 24-hour budget of roughly 100,000 tokens for LLaMA 3.3 70B. A full Chain-of-Thought classification run on 100 provisions consumes that quota in one session.

Implication: Scaling beyond three agreements or running comprehensive sweeps requires either paid API access or institutional inference infrastructure. The current pipeline demonstrates feasibility, not production capacity.
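
A rough back-of-envelope sketch of what that quota implies for the full corpus. It assumes token use scales linearly with the number of provisions, which the source does not confirm; the constant names are illustrative.

```python
import math

DAILY_TOKEN_BUDGET = 100_000   # approximate free-tier budget quoted above
PROVISIONS_PER_QUOTA = 100     # one CoT run of this size used the whole budget
TOTAL_PROVISIONS = 4_059       # full extracted corpus

# Assumption: every provision costs about the same number of tokens.
tokens_per_provision = DAILY_TOKEN_BUDGET / PROVISIONS_PER_QUOTA
days_for_full_corpus = math.ceil(
    TOTAL_PROVISIONS * tokens_per_provision / DAILY_TOKEN_BUDGET
)

print(tokens_per_provision)   # 1000.0
print(days_for_full_corpus)   # 41
```

Under that assumption, a single Chain-of-Thought pass over all 4,059 provisions with one model would take about six weeks on the free tier.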

7. Three Agreements, English Only

The corpus covers RCEP, AHKFTA, and AANZFTA in English-language versions only. Findings are suggestive of Asia-Pacific patterns but are not statistically generalisable to the broader regional landscape, which includes the Comprehensive and Progressive Agreement for Trans-Pacific Partnership (CPTPP), the ASEAN-China FTA, the Korea-ASEAN FTA, and others.

What Remains Reliable Despite These Gaps
  • Rules of Origin attribute findings (CC vs CTH, 40% RVC, 10% de minimis): the relevant chapters were extracted reliably across all three agreements.
  • Customs Procedures convergence (entropy 1.00): consistent across all three agreements and consistent with the WTO Trade Facilitation baseline.
  • Pairwise Cohen's κ (0.582–0.702): robust on shared-cohort comparison and methodologically defensible.
  • Tariff Commitments distribution differences: the framework-level patterns are reliable even if line-item rates are not.
  • Pipeline reproducibility: code, data, and gold labels are public; anyone with a Groq API key can reproduce or extend.
FTA
Free Trade Agreement. A treaty between countries that reduces or eliminates tariffs and other trade barriers.
RCEP
Regional Comprehensive Economic Partnership. A 15-party FTA covering ASEAN plus China, Japan, South Korea, Australia, and New Zealand. Signed 2020.
AHKFTA
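ASEAN-Hong Kong Free Trade Agreement. An FTA between ASEAN and Hong Kong with chapters on goods, services, intellectual property, and dispute settlement, complemented by a separate Agreement on Investment. In force since 2019.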
AANZFTA
ASEAN-Australia-New Zealand Free Trade Agreement. A comprehensive FTA including services, investment, and dispute settlement. Signed 2009.
ASEAN
Association of Southeast Asian Nations. A 10-country regional bloc including Indonesia, Thailand, Vietnam, the Philippines, Singapore, and others.
LLM
Large Language Model. An AI model trained on large amounts of text that can read, classify, summarise, and generate natural language. LLaMA and Qwen are both LLMs.
RAG
Retrieval-Augmented Generation. A technique where relevant text is retrieved from a database and given to the LLM as context before it generates a response, improving factual accuracy.
Rules of Origin (RoO)
Rules that determine whether a product "originates" from a country and thus qualifies for preferential tariff treatment under an FTA. Typically the most rule-intensive area of a goods FTA.
RVC
Regional Value Content. A product qualifies for preferential treatment if at least X% of its value was added within the FTA region. All three agreements in this study use a 40% threshold.
CTC
Change in Tariff Classification. An alternative way to satisfy Rules of Origin. The inputs used in manufacturing must fall under a different tariff code than the finished product, proving substantial transformation occurred.
CTH
Change in Tariff Heading. A CTC rule requiring a change at the 4-digit HS code level (the heading). Used by RCEP. Less strict than CC because transformation within the same chapter is allowed.
CC
Change in Chapter. A CTC rule requiring a change at the 2-digit HS code level (the chapter). Used by AHKFTA. Stricter than CTH because it demands more substantial transformation of the goods.
HS Code
Harmonized System code. An internationally standardised numbering system for traded goods. The 2-digit level is a "chapter," the 4-digit level is a "heading," and the 6-digit level is a "subheading."
NTM
Non-Tariff Measure. A trade barrier other than a tariff, for example import quotas, licensing requirements, or technical standards that effectively restrict trade.
SPS
Sanitary and Phytosanitary Measures. Rules protecting human, animal, and plant health from food-borne risks, diseases, and pests. Governed internationally by the WTO SPS Agreement.
IP
Intellectual Property. Legal rights covering inventions (patents), creative works (copyright), brand names (trademarks), and geographical indications.
ISDS
Investor-State Dispute Settlement. A mechanism allowing foreign investors to bring legal claims directly against a host government in international arbitration, bypassing domestic courts.
WTO DSU
World Trade Organization Dispute Settlement Understanding. The WTO's own mechanism for resolving trade disputes between member states. The extracted AHKFTA text references it; AHKFTA also has its own Chapter 13 (Consultations and Dispute Settlement), which was under-extracted in this project.
κ, Cohen's Kappa
A statistical measure of agreement between two classifiers that corrects for the agreement you would expect by chance alone. 1.0 means perfect agreement; 0 means no better than chance; negative means worse than chance.
Macro-F1
Macro-averaged F1 score. The unweighted mean of the per-category F1 scores, so rare categories count as much as common ones. More informative than plain accuracy when some categories are much more common than others.
Zero-shot
A prompt strategy where the model is given only the task description and the text to classify. No examples are provided. The model relies entirely on its training.
Few-shot
A prompt strategy where two labelled examples are shown to the model before the new provision. The examples help the model understand the expected format and reasoning.
CoT, Chain-of-Thought
A prompt strategy that instructs the model to reason step-by-step before giving its final answer. Improves Qwen's performance on this task but degrades LLaMA's.
OCR
Optical Character Recognition. Software that converts scanned images of printed text into machine-readable characters. Used as a fallback when PDF text extraction fails.
ChromaDB
An open-source vector database used in this project to store provision embeddings and retrieve semantically similar provisions for the RAG comparison pipeline.
MiniLM
all-MiniLM-L6-v2. A compact sentence embedding model from the sentence-transformers library. Converts text into numerical vectors for semantic similarity search.
Entropy (normalised)
A measure of how evenly distributed provisions are across the three agreements for a given category. A score of 1.0 means all three agreements contribute equally; 0 means one agreement holds everything.
WTO / UNCTAD
World Trade Organization / United Nations Conference on Trade and Development. The two international bodies whose standard FTA chapter taxonomy the 11 classification categories in this project follow.
Total Provisions
4,059
Extracted from 3 FTAs, 7 PDFs
Best Macro-F1
0.442
Qwen 3 32B, Chain-of-Thought
LLM Runs
6
2 models × 3 prompt strategies
Policy Categories
11
WTO/UNCTAD taxonomy
Verdict on Each Research Question
RQ1: Classification
Can LLMs reliably classify FTA legal provisions?
✓ Triage-grade, not provision-level
Accuracy lands between 32% and 48%. Qwen 3 32B with Chain-of-Thought reaches the best Macro-F1 at 0.442. CoT helps Qwen and hurts LLaMA; few-shot prompting hurts both.
RQ2: Policy Design
How do the agreements differ in observable design features?
✓ Same threshold, different transformation rule
RCEP and AHKFTA both use a 40% RVC threshold but RCEP applies CTH at the heading level while AHKFTA applies CC at the chapter level. AANZFTA is the only one with ISDS. Note: AHKFTA's Services, IP, and Dispute Settlement chapters were under-extracted from a scanned PDF — see data limitations.
RQ3: Convergence
Do the agreements converge or fragment?
✓ Procedural converges, substantive does not
Customs Procedures is the only category in genuine sync (entropy 1.00). Dispute Settlement (0.47) and Intellectual Property (0.37) score as fragmented, but this is largely a data artefact — AHKFTA actually has both chapters; OCR under-extracted them from the scanned source.
Three Agreements, Three Very Different Scopes

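Each bar shows what share of that agreement's sampled provisions falls into each policy area. AHKFTA's profile is unmistakably different: almost half of its sample goes to Rules of Origin, and entire categories are missing from the extracted data. Those gaps largely reflect OCR coverage of the scanned source rather than real legal absences.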

Three Key Findings
Prompt strategy must be chosen per model, not universally
Chain-of-thought prompting raises Qwen's Macro-F1 by 1.8 points but drops LLaMA's by 10.4. Few-shot prompting hurts both. Picking the right strategy is a per-model decision, and the asymmetry only became visible once all six combinations were scored against the same gold set.
⚖️
RCEP eligibility does not guarantee AHKFTA eligibility
RCEP and AHKFTA both use a 40% Regional Value Content threshold. But AHKFTA requires transformation at the chapter level (CC) while RCEP only requires it at the heading level (CTH). On paper the same product can satisfy one rule and not the other. This is the kind of detail an analyst would want flagged before a closer reading.
🌏
Most apparent fragmentation comes from missing chapters, not opposing rules
Customs Procedures is the only category where all three agreements carry similar shares of provisions, consistent with their shared WTO Trade Facilitation baseline. Dispute Settlement and Intellectual Property appear fragmented in our data, but AHKFTA does in fact contain both chapters (Ch.10 IP, Ch.13 Dispute Settlement) — they were under-extracted from a scanned source PDF. The fragmentation signal is therefore partly an artefact of OCR coverage rather than a real structural absence.
Raw Corpus Size per Agreement
RCEP contributes more than half the corpus simply because it is the largest agreement. The comparative analysis uses a balanced sample of 100 provisions per agreement so that RCEP's size does not drown out the other two.
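
A minimal sketch of how such a balanced sample could be drawn. The record fields (`agreement`, `id`), the function name, and the fixed seed are hypothetical; the source does not say whether the project's sample is also stratified by chapter, so this draws at random within each agreement only.

```python
import random
from collections import defaultdict


def balanced_sample(provisions, per_agreement=100, seed=0):
    """Draw the same number of provisions from each agreement."""
    by_agreement = defaultdict(list)
    for provision in provisions:
        by_agreement[provision["agreement"]].append(provision)

    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    sample = []
    for agreement in sorted(by_agreement):
        pool = by_agreement[agreement]
        # Take the whole pool if an agreement has fewer than per_agreement rows.
        sample.extend(rng.sample(pool, min(per_agreement, len(pool))))
    return sample


# Synthetic corpus using the per-agreement sizes reported on this page.
sizes = {"RCEP": 2171, "AHKFTA": 362, "AANZFTA": 1526}
corpus = [
    {"agreement": name, "id": i}
    for name, size in sizes.items()
    for i in range(size)
]
sample = balanced_sample(corpus)
print(len(sample))  # 300
```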
Headline Numbers
Best Accuracy
LLaMA 3.3 70B, Zero-Shot
0.480
Best Macro-F1
Qwen 3 32B, Chain-of-Thought
0.442
Highest Cohen's κ
LLaMA ZS vs Qwen ZS, n=200
0.702
Most Convergent Category
Customs Procedures
1.00
Most Fragmented Category
Intellectual Property
0.37
RCEP
15 nations · Signed 2020 · In force 2022
20 chapters · 2,171 provisions (53.5%)
Includes: Goods, Services, Investment, IP, Dispute Settlement
AHKFTA
ASEAN + Hong Kong · Signed 2017
362 provisions (8.9%)
Includes: Goods, Services (Ch.8), IP (Ch.10), Dispute Settlement (Ch.13). A separate Agreement on Investment (2018) also exists.
⚠ Source PDF is fully scanned. OCR captured goods/RoO chapters well but under-extracted Services, IP, and Dispute Settlement chapters. Investment Agreement was not yet processed. Coverage figures below understate AHKFTA's actual scope.
AANZFTA
ASEAN + AU + NZ · Signed 2009
1,526 provisions (37.6%)
Includes: Services, Investment (Ch.11), Dispute Settlement, ISDS
Accuracy, % of provisions correctly labelled
LLaMA zero-shot leads on raw accuracy at 0.480, with Qwen CoT close behind at 0.460. No run clears 50%, which is why aggregate patterns are more reliable than any single label.
Macro-F1, per-category F1 averaged with equal weight
Qwen CoT leads on Macro-F1 even though LLaMA zero-shot beats it on raw accuracy. That gap suggests Qwen handles the rarer categories better, while LLaMA may be over-favouring the most common ones.
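
To see how a run can lose on accuracy yet win on Macro-F1, here is a small sketch on toy labels. The data are invented for illustration and are not the project's gold set; the helper names are hypothetical.

```python
def accuracy(gold, pred):
    """Share of items whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-category F1 over all labels seen."""
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        # F1 = 2TP / (2TP + FP + FN); the denominator is never zero here
        # because each label occurs in gold or pred at least once.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


# Toy data: one common category and one rare category.
gold = ["RoO"] * 8 + ["IP"] * 2
run_a = ["RoO"] * 10                         # always predicts the common label
run_b = ["RoO"] * 5 + ["IP"] * 3 + ["IP"] * 2  # catches the rare label, at a cost

print(accuracy(gold, run_a), round(macro_f1(gold, run_a), 3))  # 0.8 0.444
print(accuracy(gold, run_b), round(macro_f1(gold, run_b), 3))  # 0.7 0.67
```

Run A has the higher accuracy because it never misses the common category, but it scores zero F1 on the rare one, which drags its macro average down.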
Full Validation Table, all 6 runs ranked
Model | Strategy | Accuracy | Macro-F1 | n | Note
LLaMA 3.3 70B | Zero-Shot | 0.480 | 0.431 | 50 | Best raw accuracy
Qwen 3 32B | Chain-of-Thought | 0.460 | 0.442 | 50 | 🏆 Best Macro-F1
Qwen 3 32B | Zero-Shot | 0.380 | 0.424 | 50 |
Qwen 3 32B | Few-Shot | 0.380 | 0.373 | 50 | Few-shot hurts Qwen
LLaMA 3.3 70B | Few-Shot | 0.340 | 0.336 | 50 |
LLaMA 3.3 70B | Chain-of-Thought | 0.320 | 0.327 | 50 | CoT hurts LLaMA
The clearest pattern in this table is that chain-of-thought prompting moves the two models in opposite directions: it helps Qwen and hurts LLaMA. Few-shot prompting hurts both. That asymmetry only became visible after all six combinations were scored against the same gold set.
Key Finding: The Two Models Respond to Prompting in Opposite Directions
Qwen 3 32B, CoT is the best strategy
Zero-shot Macro-F1: 0.424
Few-shot Macro-F1: 0.373 (−5.1 points)
CoT Macro-F1: 0.442 (+1.8 points)
Qwen 3 is a "thinking model": it emits internal reasoning traces by default. CoT prompting plausibly works with that behaviour, giving the model space to work through legal text before committing to a label. Few-shot examples appear to anchor it to the wrong prior.
Chart: Macro-F1 comparison across strategies (Zero-shot, Few-shot, Chain-of-Thought)
LLaMA 3.3 70B, Zero-shot is the best strategy
Zero-shot Macro-F1: 0.431
Few-shot Macro-F1: 0.336 (−9.5 points)
CoT Macro-F1: 0.327 (−10.4 points)
LLaMA's default zero-shot understanding of legal categories is better calibrated than what the prompting strategies produce. Adding examples or reasoning instructions actively degrades its output, possibly because it over-conditions on the in-context content.
Chart: Macro-F1 comparison across strategies (Zero-shot, Few-shot, Chain-of-Thought)
Pairwise κ, all run pairs on shared provisions
Run A | Run B | κ | n shared | Interpretation
LLaMA zero-shot | Qwen zero-shot | 0.702 | 200 | Substantial, strongest cross-model alignment
Qwen zero-shot | Qwen few-shot | 0.689 | 200 | Substantial, within-model consistency
LLaMA zero-shot | LLaMA few-shot | 0.668 | 200 | Substantial, within-model consistency
LLaMA CoT | Qwen CoT | 0.640 | 100 | Substantial, CoT aligns both models
LLaMA few-shot | Qwen few-shot | 0.582 | 200 | Moderate, few-shot diverges the models more
The two models agree more with each other on zero-shot (κ = 0.702) than either model agrees with its own few-shot run (0.689 for Qwen, 0.668 for LLaMA). This suggests both models share a baseline understanding of FTA legal categories, and that prompting strategies disrupt that shared signal more than they help it.
What This Means for Reliability

With κ values in the 0.58 to 0.70 range, the two models are moderately to substantially consistent with each other when working on the same provisions. They are not labelling at random; they share a meaningful common signal.
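
For readers who want to check a pairwise κ themselves, the following is a minimal sketch of Cohen's κ on two label lists. The toy labels are invented for illustration and the function name is hypothetical; the project's own scoring code may differ.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement from each run's own label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1:  # both runs used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy example: two runs labelling the same ten provisions.
run_a = ["RoO"] * 5 + ["Tariff"] * 3 + ["Customs"] * 2
run_b = ["RoO"] * 4 + ["Tariff"] * 3 + ["Customs"] * 3

print(f"{cohens_kappa(run_a, run_b):.3f}")  # 0.692
```

In this toy case the runs agree on 8 of 10 provisions, but 0.35 of that agreement is expected by chance, which is why κ sits below the raw 0.80 agreement rate.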

The most important finding here is that aggregate category distributions are more trustworthy than individual labels. If both LLaMA and Qwen (zero-shot) assign 48% of AHKFTA provisions to Rules of Origin, that pattern is robust. The value of this pipeline is in answering questions like "what topics does AHKFTA focus on?", not "is this exact provision a Rules of Origin provision?"

The three highest κ values all involve a zero-shot run, and the highest cross-model κ is between the two zero-shot runs. This is the practical recommendation: zero-shot produces the most reproducible and cross-model-consistent results, even if Qwen CoT edges it out on Macro-F1.

Provision Count by Category and Agreement
AHKFTA's bar dominates Rules of Origin while showing zero or near-zero counts in Dispute Settlement, Trade in Services, and Intellectual Property. This pattern is a data artefact, not a real design feature. AHKFTA in fact contains Chapter 8 (Trade in Services), Chapter 10 (Intellectual Property), and Chapter 13 (Consultations and Dispute Settlement). The source PDF is fully scanned, and Tesseract OCR captured the goods-trade chapters well but degraded substantially on the legal-paragraph chapters that follow. The "compression into goods chapters" is therefore extraction loss, not legal scope.
Provision Counts, colour intensity shows concentration
Category | RCEP | AHKFTA | AANZFTA | Total
Reading down the AHKFTA column, the zeros are missing values, not real absences — they reflect OCR failure on the scanned source for chapters that exist in the agreement but were not successfully extracted. The strongest cells (AHKFTA's 48 Rules of Origin and 31 Tariff Commitments, AANZFTA's 22 Dispute Settlement) are reliable; AHKFTA's zeros in Services, IP, and Dispute Settlement are not.
Three Structural Patterns Worth Highlighting
AHKFTA
Rules of Origin heavy, 48% of its sample, vs 24% for RCEP
In our extracted sample, AHKFTA's provisions concentrate heavily in Rules of Origin and Tariff Commitments. The CC transformation rule, stricter than RCEP's CTH, may generate more legal text per topic. Caveat: the sample under-represents AHKFTA's Services, IP, and Dispute Settlement chapters because those sections of the scanned source were not reliably captured by OCR. The actual AHKFTA covers all those areas, and a separate 2018 Investment Agreement has not yet been processed.
AANZFTA
Dispute Settlement leader, 22 provisions vs 6 for RCEP and 0 for AHKFTA
AANZFTA maintains its own independent dispute settlement mechanism with dedicated adjudication and arbitration procedures. AHKFTA's zero is an extraction artefact: the agreement has its own Chapter 13 (Consultations and Dispute Settlement), but OCR did not capture it reliably, and the text that was extracted refers to the WTO DSU. RCEP runs its own mechanism, but fewer of its provisions surfaced in the stratified sample.
RCEP
Broadest coverage, the only agreement with meaningful IP provisions (12)
RCEP's 15-party scope drives it to codify a wide range of policies, and its source PDF is born-digital so extraction was reliable. Its 20 Trade in Services and 12 IP provisions reflect actual chapter coverage. AHKFTA also contains Services and IP chapters in principle, but those were not fully captured from the scanned source — so a clean comparison of scope across all three agreements is not yet possible from this dataset alone.
Entropy Ratio by Category, convergence vs fragmentation
Customs Procedures is the only category where all three agreements are genuinely in sync. Dispute Settlement and Intellectual Property sit at the fragmented end of the entropy chart, but the reason is mostly an extraction artefact: AHKFTA actually has both chapters (Ch.10 IP and Ch.13 Dispute Settlement), and they were under-captured from the scanned source PDF. So the entropy score reads "fragmented" while the underlying legal scope is closer to "all three present, comparable in size."
Convergence Signal Table, raw counts behind each score
Category | RCEP | AHKFTA | AANZFTA | Entropy Ratio | Signal
Dispute Settlement shows counts of 6, 0, and 22 across the three agreements. That spread is what drives the fragmentation score: one agreement carries most of the provisions and another has none. Because AHKFTA's zero reflects OCR under-extraction of its Chapter 13, the score should not be read as a structural scope difference, and it is not evidence of a drafting conflict either.
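
The page does not publish the formula behind the entropy ratio. The sketch below assumes it is the Shannon entropy of the per-agreement provision shares divided by its maximum, log(3); under that assumption the Dispute Settlement counts reproduce the reported 0.47. The function name is illustrative.

```python
import math


def entropy_ratio(counts):
    """Shannon entropy of the count shares, divided by the maximum log(k).

    Assumption: this is how the dashboard's entropy ratio is defined.
    Zero counts contribute nothing to the entropy but still count toward k.
    """
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in shares)
    return entropy / math.log(len(counts))


# Dispute Settlement counts for RCEP, AHKFTA, AANZFTA.
print(round(entropy_ratio([6, 0, 22]), 2))    # 0.47
# A perfectly even split across three agreements.
print(round(entropy_ratio([10, 10, 10]), 2))  # 1.0
```

Because a single zero count pulls the ratio down sharply, an extraction gap in one agreement is enough to move a category from "convergent" to "fragmented".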
What Convergence and Fragmentation Mean for Policy

Convergent: Customs Procedures (1.00). All three agreements allocate almost identical shares of their sampled text to customs procedures. This is consistent with the WTO Trade Facilitation Agreement providing a shared procedural baseline (documentation, advance rulings, risk-based release) that all three agreements follow. It is the one category where the data point to genuine regional harmonisation.

Fragmented: Dispute Settlement (0.47) and Intellectual Property (0.37). These scores are driven almost entirely by AHKFTA having near-zero provisions in our extracted sample for both categories. However, AHKFTA does in fact contain Chapter 10 (Intellectual Property) and Chapter 13 (Consultations and Dispute Settlement) — the Table of Contents of the signed agreement confirms it. The provisions were under-extracted because the AHKFTA PDF is fully scanned and OCR degraded on the legal-paragraph chapters that follow the goods chapters. So the fragmentation signal here is largely a measurement artefact, not a real structural absence.

Caveat on Rules of Origin. Rules of Origin appears convergent (high entropy ratio) but this is partly inflated by the few-shot classification method used. Both in-context examples in the prompt were goods-trade categories, which biases the model toward classifying more provisions as Rules of Origin across all three agreements and makes the distribution appear more even than it actually is.

Caveat on Investment. Investment also scores highly on the entropy ratio. The 4 AHKFTA provisions classified as investment-related may include genuine references — there is in fact a separate Agreement on Investment among Hong Kong and ASEAN signed 18 May 2018 that complements the AHKFTA goods agreement. That separate Investment Agreement was not yet processed in this pipeline, so neither the high entropy nor the low provision count for AHKFTA Investment can be interpreted reliably until it is added. The honest reading is "investment scope cannot be assessed from current data."

Feature Comparison, how each agreement handles key trade policy topics
Feature | RCEP | AHKFTA | AANZFTA
Tariff Amendment | Consensus + formal procedure; unilateral importer notification | HS 2012 reference; product-specific exporter choice | Unilateral modification rights; selective HS concessions
RoO Governance | Committee (Annex 3A/3B); CTH rule | Sub-Committee; CC rule, stricter transformation | Certificate-based; exporter compliance burden
RVC Threshold | 40% | 40% | Not in main text, delegated to schedules
CTC Rule | CTH, 4-digit heading level. Easier to satisfy. | CC, 2-digit chapter level. More restrictive. | Not recovered in main text
Dispute Settlement | Own mechanism; independent of WTO DSU | WTO DSU referenced in extracted text; own Ch.13 exists but was under-extracted | Own mechanism; adjudication + arbitration
Investment | Dedicated Ch.10; national treatment | No dedicated chapter; 4 provisions classified as investment-related; separate 2018 Agreement on Investment not yet processed | Dedicated Ch.11; national treatment; ISDS
Customs | Direct consignment requirements | No legalisation / authentication required | Risk-based clearance for low-risk goods
Services | Mode 1 to 4; schedule of commitments | Ch.8 with Annex 8-1 schedules exists but was not captured in extraction | Mode 1 to 4; schedule of commitments
The CTC Rule row is where the comparison gets practically interesting. RCEP and AHKFTA both use a 40% RVC threshold, which makes them look interchangeable at first glance. Look one row down and the transformation requirement diverges: chapter-level for AHKFTA, heading-level for RCEP, which means a product can in principle satisfy one rule and not the other.
Why the CC vs CTH Difference Matters

RCEP and AHKFTA both use a 40% RVC threshold. An exporter looking at just that number would conclude they meet the Rules of Origin requirements under both agreements. But the CTC rule, the other requirement, differs in a practically important way:

RCEP, CTH (Heading level, 4-digit)
The inputs and the finished product must fall under different 4-digit HS headings. Inputs from the same 2-digit chapter are allowed, as long as the finished product ends up under a different heading. Easier to satisfy, more flexible for manufacturers.
AHKFTA, CC (Chapter level, 2-digit)
The inputs and the finished product must fall under different 2-digit HS chapters. This is a much stricter test: it requires the manufacturing process to involve a genuine change in the nature of the goods at the chapter level.
What this implies
Based on the LLM-extracted text, a product that meets the 40% RVC threshold and clears CTH under RCEP would not automatically clear AHKFTA's stricter CC rule. That is a difference worth flagging for closer manual review rather than a verdict on any specific shipment. The dashboard’s role is to surface the gap, not to score it.
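
The CC versus CTH difference can be sketched as a comparison of HS code prefixes. This is a simplified illustration: it ignores de minimis tolerances, product-specific rules, and the RVC test, and the function names and HS codes are chosen for the example only.

```python
def chapter(hs_code: str) -> str:
    """First two digits of an HS code (the chapter)."""
    return hs_code[:2]


def heading(hs_code: str) -> str:
    """First four digits of an HS code (the heading)."""
    return hs_code[:4]


def satisfies_ctc(input_codes, product_code, rule):
    """True if every input differs from the product at the rule's HS level.

    rule is "CC" (change in chapter) or "CTH" (change in tariff heading).
    """
    level = {"CC": chapter, "CTH": heading}[rule]
    return all(level(code) != level(product_code) for code in input_codes)


# Illustrative case: input and product sit in the same chapter (52)
# but under different headings (5205 vs 5208).
inputs = ["520511"]
product = "520811"

print(satisfies_ctc(inputs, product, "CTH"))  # True
print(satisfies_ctc(inputs, product, "CC"))   # False
```

The same bill of materials passes the heading-level test and fails the chapter-level one, which is the gap the comparison is meant to surface for manual review.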
About these AI-generated comparisons
Each category below was analysed by retrieving the most relevant provisions from all three agreements and asking the model to write a structured comparison. The output is richer than a single-provision classification because the model reads primary text before responding. That said, these are AI-generated summaries and any specific legal conclusions should be checked against the original agreement text.