Validation walkthrough · TIBCO BusinessWorks → AWS

How CogniDev migrates a TIBCO project to AWS — and how you verify every line of it.

This document walks through five concrete file pairs from a representative TIBCO BusinessWorks project (the source) and the AWS artifacts CogniDev emits (the target). For each pair, you see the exact source construct, the deterministic rule that fired, the target artifact, and a line-level lineage proof. The goal is to make every claim on cognidev.ai testable on real files.

The core claim, stated precisely: The structural translation — control flow, activity-to-service mapping, IaC, lineage — is performed by a versioned rule library, not by an LLM. The LLM has five bounded roles (naming, comments, test fixtures, long-tail body proposals, reviewer narratives); every LLM output passes a deterministic contract gate before it touches a generated artifact. Strip the LLM and the migration still compiles; the artifacts get harder to read, not less correct.

The deterministic cycle

Every migration runs through six phases. Each phase emits a signed artifact that the next phase reads. Nothing is implicit; everything is on disk and re-runnable.

  1. Parse. Read every .bwp (process), .process, .substvar (substitution variables), .xpdl, .wsdl, .xsd, adapter config, JMS destination, JDBC binding. Build a typed AST per file. Output: parse-tree.json.
  2. Graph. Resolve transitions, sub-process calls, JMS topic flows, shared resources. Build the project-wide DAG. Output: dependency-graph.json with every activity as a node and every transition / message flow as an edge.
  3. Classify. Walk the DAG and tag idiomatic shapes (linear, iterate, parallel, choice, request-reply, file-poll, claim-check, dead-letter). Tag transactional boundaries (XA, local). Output: classification.json.
  4. Map. For every node and every shape, look up the rule in the published rule library. Each rule has an ID, a version, an input schema (which BW node it accepts) and an output schema (which AWS artifact it produces). Output: rules-applied.csv, one row per node.
  5. Emit. Each rule emits its target artifact: Step Functions ASL, Lambda handler stubs, CDK / Terraform IaC, IAM policies, Amazon MQ / SQS / SNS / MSK configs, RDS schemas, JSONata mappers. Output: a complete AWS project tree.
  6. Verify. Schema-in / schema-out equivalence between source and target. Sample-message replay where source endpoints are reachable. Static analysis on the emitted code. Output: verification-report.json + the signed evidence pack.
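
To make the Graph phase concrete, here is a minimal sketch of how transitions could be turned into edges and an execution order. It is not CogniDev's implementation: the function names build_edges and execution_order are hypothetical, and the inline XML is a trimmed copy of the OrderIngest transitions used in this walkthrough.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter  # Python 3.9+

BW_NS = "http://xsd.tns.tibco.com/amf/models/sharedresource/bw"

# Trimmed copy of the OrderIngest transitions; everything else is omitted.
SAMPLE = f"""
<bw:process name="OrderIngest" xmlns:bw="{BW_NS}">
  <bw:transitions>
    <bw:transition from="ReceiveOrder" to="ParseOrder"/>
    <bw:transition from="ParseOrder" to="LookupCustomer"/>
    <bw:transition from="LookupCustomer" to="PublishEnriched"/>
  </bw:transitions>
</bw:process>
""".strip()


def build_edges(process_xml: str) -> list[tuple[str, str]]:
    """Return one (from, to) edge per bw:transition element."""
    root = ET.fromstring(process_xml)
    return [
        (t.get("from"), t.get("to"))
        for t in root.iter(f"{{{BW_NS}}}transition")
    ]


def execution_order(edges: list[tuple[str, str]]) -> list[str]:
    """Topologically sort activities so each one follows its predecessors."""
    sorter = TopologicalSorter()
    for src, dst in edges:
        sorter.add(dst, src)  # dst depends on src
    return list(sorter.static_order())


edges = build_edges(SAMPLE)
order = execution_order(edges)
```

For a linear process the order is simply the transition chain; a cycle in the input would make static_order raise graphlib.CycleError, which is one way a loop-back could be detected and handed to a dedicated rule.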

The LLM boundary

The LLM is not a translator. It is a writer with a strictly bounded scope, and every output it produces is checked by a deterministic gate before it is allowed into a generated artifact. This is the part of the system that lets us claim "deterministic" honestly.

LLM TOUCHES

Five bounded roles

  1. Naming. Proposes business-domain names for generated Lambdas, queues, state machines, IAM roles. The proposal must pass: uniqueness in the namespace, length / regex / reserved-word checks, no PII in the name.
  2. Comments. Writes the human-readable line above each generated state ("// Charges a single line item against billing service"). No structural impact — comments are stripped before semantic comparison.
  3. Test fixtures. Generates boundary-condition input samples. Each sample must conform to the source schema before being added to the replay set.
  4. Long-tail body proposals. When an embedded code activity references a vendor-internal class the rule library doesn't know (e.g. com.tibco.pe.plugin.*), the LLM proposes one replacement. The replacement must compile, must accept the source input schema, must return the source output schema, and must produce identical output for the replay set. One shot. If it fails the gate, the activity is flagged for human review — no iteration loop on a failing contract.
  5. Reviewer narrative. Drafts the cover memo for the human reviewer (what migrated cleanly, what was flagged, what to look at first). No structural impact.

LLM NEVER TOUCHES

What the rule library owns

  • Activity → service mapping. A BW JMS-XA receiver always becomes Amazon MQ. A BW JMS non-transactional receiver always becomes SQS. The choice is in the rule, not in a prompt.
  • Control flow. Iterate, Parallel, Choice, Catch/Retry — all translated by rules with explicit AST-to-AST mappings.
  • IaC. Every CDK / Terraform line comes from a template parameterized by the parse tree.
  • Schema translation. XSD → JSON Schema is a published rule set; we do not let an LLM "interpret" a type.
  • XPath → JSONata / JSONPath. Covered by a rule library for the XPath 1.0 subset. Anything outside the subset is flagged, not silently guessed.
  • Lineage. Every output line has a deterministic back-pointer to a source line + a rule ID + a rule version. The LLM cannot fabricate this.
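
As an illustration of what "a published rule set" for schema translation can look like, the sketch below treats XSD → JSON Schema type translation as a plain table lookup. The four type pairs are the ones that appear in the schema-diff example in this document; the names translate_type and UnmappedType, and the choice to raise on an unknown type, are assumptions made for the example.

```python
# Type pairs taken from the schema-diff example in this walkthrough.
XSD_TO_JSON_SCHEMA = {
    "xsd:string":   {"type": "string"},
    "xsd:decimal":  {"type": "number"},
    "xsd:int":      {"type": "integer"},
    "xsd:dateTime": {"type": "string", "format": "date-time"},
}


class UnmappedType(Exception):
    """Raised for an XSD type the table does not cover (hypothetical name)."""


def translate_type(xsd_type: str) -> dict:
    """Look the type up; never guess. Unknown types surface as an error."""
    try:
        return dict(XSD_TO_JSON_SCHEMA[xsd_type])  # copy so callers can't mutate the table
    except KeyError:
        raise UnmappedType(xsd_type) from None
```

The point of the design is that an unknown type has exactly one outcome, an explicit failure that can be flagged, rather than a best-effort interpretation.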

The contract gate (every LLM output passes here)

LLM proposes → Compiles? → Input schema matches? → Output schema matches? → Replay equivalence? → Accepted

A failure at any step rejects the proposal. The LLM does not get a retry loop. If the rule library can't pick it up either, the artifact is flagged for human review and the evidence pack records the gap.
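
The gate's control flow can be sketched as an ordered list of checks where the first failure ends the evaluation. This is a minimal sketch under stated assumptions: the Verdict type, the run_gate function, and the boolean fields on the proposal are hypothetical stand-ins for a real compiler, schema validators, and replay harness.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Verdict:
    accepted: bool
    failed_check: Optional[str] = None  # name of the first check that failed


Check = tuple[str, Callable[[dict], bool]]

# Order follows the gate described above. Each lambda is a placeholder:
# here a proposal is just a dict of pre-computed results.
GATE: list[Check] = [
    ("compiles",      lambda p: p["compiles"]),
    ("input_schema",  lambda p: p["input_schema_ok"]),
    ("output_schema", lambda p: p["output_schema_ok"]),
    ("replay",        lambda p: p["replay_equal"]),
]


def run_gate(proposal: dict, checks: list[Check] = GATE) -> Verdict:
    """Run checks in order; reject on the first failure, with no retry."""
    for name, check in checks:
        if not check(proposal):
            return Verdict(False, name)
    return Verdict(True)
```

Because run_gate returns on the first failing check and is called once per proposal, there is no code path that feeds a rejection back to the proposer.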

Linear process — order ingestion

JMS-XA receive → XML parse → JDBC lookup → JMS-XA publish. The simplest control flow; the most important rule decision (XA boundary).

SOURCE OrderIngest.bwp
 1 <?xml version="1.0" encoding="UTF-8"?>
 2 <bw:process name="OrderIngest"
 3             xmlns:bw="http://xsd.tns.tibco.com/amf/models/sharedresource/bw">
 4
 5   <bw:starter name="ReceiveOrder" type="bw.jms.JMSQueueReceiver">
 6     <bw:property name="ConnectionFactory">JNDI:QCF.Orders</bw:property>
 7     <bw:property name="QueueName">ORDERS.IN</bw:property>
 8     <bw:property name="TransactionMode">XA</bw:property>
 9   </bw:starter>
10
11   <bw:activity name="ParseOrder" type="bw.xml.ParseXML">
12     <bw:property name="Schema">/Schemas/Order.xsd</bw:property>
13   </bw:activity>
14
15   <bw:activity name="LookupCustomer" type="bw.jdbc.JDBCQuery">
16     <bw:property name="DataSource">JNDI:DS.CustomerDB</bw:property>
17     <bw:property name="SQL">SELECT id, tier, region FROM customers WHERE id = ?</bw:property>
18   </bw:activity>
19
20   <bw:activity name="PublishEnriched" type="bw.jms.JMSQueueSender">
21     <bw:property name="ConnectionFactory">JNDI:QCF.Orders</bw:property>
22     <bw:property name="QueueName">ORDERS.ENRICHED</bw:property>
23     <bw:property name="DeliveryMode">PERSISTENT</bw:property>
24   </bw:activity>
25
26   <bw:transitions>
27     <bw:transition from="ReceiveOrder" to="ParseOrder"/>
28     <bw:transition from="ParseOrder" to="LookupCustomer"/>
29     <bw:transition from="LookupCustomer" to="PublishEnriched"/>
30   </bw:transitions>
31 </bw:process>
TARGET infra/state-machines/order-ingest.asl.json
 1 {
 2   "Comment": "Generated from OrderIngest.bwp · rule-set 4.7 · src-hash a3f9b2c1",
 3   "StartAt": "ReceiveOrder",
 4   "QueryLanguage": "JSONata",
 5   "States": {
 6     "ReceiveOrder": {
 7       "Type": "Task",
 8       // src: OrderIngest.bwp#L5  rule: JMS_RECEIVE_XA_TO_AMAZON_MQ@2.3
 9       "Resource": "arn:aws:states:::aws-sdk:mq:receiveMessage",
10       "Arguments": {
11         "BrokerId": "{% $broker.orders %}",
12         "QueueName": "ORDERS.IN",
13         "WaitTimeSeconds": 20
14       },
15       "Next": "ParseOrder"
16     },
17     "ParseOrder": {
18       "Type": "Task",
19       // src: OrderIngest.bwp#L11  rule: XML_VALIDATE_TO_LAMBDA@1.4
20       "Resource": "arn:aws:states:::lambda:invoke",
21       "Arguments": {
22         "FunctionName": "validate-order-schema",
23         "Payload": { "schema": "Order.xsd", "xml": "{% $states.input.Messages[0].Body %}" }
24       },
25       "ResultPath": "$.order",
26       "Next": "LookupCustomer"
27     },
28     "LookupCustomer": {
29       "Type": "Task",
30       // src: OrderIngest.bwp#L15  rule: JDBC_PARAM_TO_RDS_DATA_API@3.2
31       "Resource": "arn:aws:states:::aws-sdk:rdsdata:executeStatement",
32       "Arguments": {
33         "ResourceArn": "{% $env.CUSTOMER_DB_ARN %}",
34         "SecretArn": "{% $env.CUSTOMER_DB_SECRET %}",
35         "Database": "customers",
36         "Sql": "SELECT id, tier, region FROM customers WHERE id = :id",
37         "Parameters": [{ "name": "id", "value": { "stringValue": "{% $order.customerId %}" }}]
38       },
39       "ResultPath": "$.customer",
40       "Next": "PublishEnriched"
41     },
42     "PublishEnriched": {
43       "Type": "Task",
44       // src: OrderIngest.bwp#L20  rule: JMS_SEND_TO_AMAZON_MQ@2.3
45       "Resource": "arn:aws:states:::aws-sdk:mq:sendMessage",
46       "Arguments": {
47         "BrokerId": "{% $broker.orders %}",
48         "QueueName": "ORDERS.ENRICHED",
49         "DeliveryMode": "PERSISTENT",
50         "MessageBody": "{% $order ~> |$| {'customer': $customer.records[0]} | %}"
51       },
52       "End": true
53     }
54   }
55 }

Lineage

Source (line) | Rule (id @ version) | Target (line)
OrderIngest.bwp#L5 <bw:starter ReceiveOrder> + TransactionMode=XA | JMS_RECEIVE_XA_TO_AMAZON_MQ@2.3 | order-ingest.asl.json#L6 States.ReceiveOrder
#L11 <bw:activity ParseOrder type=bw.xml.ParseXML> | XML_VALIDATE_TO_LAMBDA@1.4 | #L17 States.ParseOrder + lambda/validate-order-schema/
#L15 <bw:activity LookupCustomer type=bw.jdbc.JDBCQuery> | JDBC_PARAM_TO_RDS_DATA_API@3.2 | #L28 States.LookupCustomer
#L20 <bw:activity PublishEnriched type=bw.jms.JMSQueueSender> | JMS_SEND_TO_AMAZON_MQ@2.3 | #L42 States.PublishEnriched
#L27-29 <bw:transitions> | LINEAR_TRANSITION_TO_NEXT@1.0 | States.*.Next chain

Rule decision worth noticing. TransactionMode=XA on the receiver routed the migration to Amazon MQ (ActiveMQ engine — supports XA / JMS 2.0) instead of SQS (which does not). The rule body literally reads if bw.jms.* and TransactionMode in {XA, LOCAL_TRANSACTED}: target = AmazonMQ else: target = SQS. The choice is a property of the rule, published and versioned. The LLM did not pick this. Change TransactionMode to NONE and re-run; the rule lookup changes; the target changes; the lineage row records the swap. That's what "deterministic" means in operational terms.
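
The rule body quoted above is small enough to write out. The following is a sketch of that one decision only, mirroring the quoted condition literally; the function name jms_target and the default of "NONE" for a missing TransactionMode are assumptions.

```python
TRANSACTIONAL_MODES = {"XA", "LOCAL_TRANSACTED"}


def jms_target(activity_type: str, transaction_mode: str = "NONE") -> str:
    """Pick the messaging target for a BW JMS activity.

    Mirrors the quoted rule: bw.jms.* with a transactional mode goes to
    Amazon MQ, everything else goes to SQS. Assumed to be called only
    for JMS activities.
    """
    if activity_type.startswith("bw.jms.") and transaction_mode in TRANSACTIONAL_MODES:
        return "AmazonMQ"
    return "SQS"


# Same activity type, different property value, different target.
xa_target = jms_target("bw.jms.JMSQueueReceiver", "XA")
none_target = jms_target("bw.jms.JMSQueueReceiver", "NONE")
```

The function has no hidden state and no sampling, so the same input always yields the same target, which is the operational meaning of "deterministic" used here.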

For Each loop — billing dispatch (the one you asked for)

A bw.generalactivities.Iterate group, sequential mode, with a nested Choice and a Sleep-based retry. Real-world control-flow density.

SOURCE BillingDispatch.bwp
 1 <bw:process name="BillingDispatch" xmlns:bw="...">
 2
 3   <bw:starter name="ReceiveInvoice" type="bw.jms.JMSQueueReceiver">
 4     <bw:property name="QueueName">INVOICE.READY</bw:property>
 5   </bw:starter>
 6
 7   <bw:group name="ForEachLineItem" type="bw.generalactivities.Iterate">
 8     <bw:property name="IterationVariable">lineItem</bw:property>
 9     <bw:property name="OverExpression">$Invoice/LineItems/Item</bw:property>
10     <bw:property name="Sequence">true</bw:property>
11     <bw:property name="IndexSlot">$_lineIdx</bw:property>
12     <bw:property name="AccumulateOutput">true</bw:property>
13
14     <bw:activity name="ChargeAccount" type="bw.http.SendHTTPRequest">
15       <bw:property name="Url">https://billing.svc/charge</bw:property>
16       <bw:property name="Method">POST</bw:property>
17       <bw:input>
18         <body>
19           <sku>$lineItem/sku</sku>
20           <qty>$lineItem/qty</qty>
21           <unitPrice>$lineItem/unitPrice</unitPrice>
22         </body>
23       </bw:input>
24     </bw:activity>
25
26     <bw:activity name="CheckChargeResult" type="bw.generalactivities.Choice">
27       <bw:branch condition="$ChargeAccount/statusCode = 200" to="LogSuccess"/>
28       <bw:branch condition="$ChargeAccount/statusCode >= 500" to="RetryAfterDelay"/>
29       <bw:branch condition="true" to="LogFailure"/>
30     </bw:activity>
31
32     <bw:activity name="RetryAfterDelay" type="bw.generalactivities.Sleep">
33       <bw:property name="DurationMs">2000</bw:property>
34     </bw:activity>
35
36     <bw:activity name="LogSuccess" type="bw.generalactivities.WriteToLog">
37       <bw:property name="Level">INFO</bw:property>
38       <bw:property name="Message">Charged $lineItem/sku for $lineItem/qty units</bw:property>
39     </bw:activity>
40
41     <bw:activity name="LogFailure" type="bw.generalactivities.WriteToLog">
42       <bw:property name="Level">ERROR</bw:property>
43       <bw:property name="Message">Failed to charge $lineItem/sku, status $ChargeAccount/statusCode</bw:property>
44     </bw:activity>
45
46   </bw:group>
47
48   <bw:activity name="PublishSummary" type="bw.jms.JMSQueueSender">
49     <bw:property name="QueueName">INVOICE.DISPATCHED</bw:property>
50   </bw:activity>
51 </bw:process>
TARGET infra/state-machines/billing-dispatch.asl.json
 1 {
 2   "Comment": "Generated from BillingDispatch.bwp · rule-set 4.7",
 3   "StartAt": "ReceiveInvoice",
 4   "QueryLanguage": "JSONata",
 5   "States": {
 6     "ReceiveInvoice": {
 7       "Type": "Task",
 8       // src: BillingDispatch.bwp#L3  rule: JMS_RECEIVE_TO_SQS@2.1
 9       "Resource": "arn:aws:states:::aws-sdk:sqs:receiveMessage",
10       "Arguments": { "QueueUrl": "{% $queues.invoice_ready %}" },
11       "ResultPath": "$.invoice",
12       "Next": "ForEachLineItem"
13     },
14     "ForEachLineItem": {
15       "Type": "Map",
16       // src: BillingDispatch.bwp#L7  rule: ITERATE_SEQ_TO_STEPFN_MAP@2.0
17       "ItemsPath": "$.invoice.lineItems",
18       "MaxConcurrency": 1,   // from Sequence=true
19       "ItemSelector": {
20         "lineItem.$": "$$.Map.Item.Value",
21         "lineIdx.$":  "$$.Map.Item.Index"   // from IndexSlot
22       },
23       "ItemProcessor": {
24         "ProcessorConfig": { "Mode": "INLINE" },
25         "StartAt": "ChargeAccount",
26         "States": {
27           "ChargeAccount": {
28             "Type": "Task",
29             // src: BillingDispatch.bwp#L14  rule: HTTP_TO_STEPFN_HTTP@1.2
30             "Resource": "arn:aws:states:::http:invoke",
31             "Arguments": {
32               "ApiEndpoint": "https://billing.svc/charge",
33               "Method": "POST",
34               "RequestBody": {
35                 "sku":       "{% $lineItem.sku %}",
36                 "qty":       "{% $lineItem.qty %}",
37                 "unitPrice": "{% $lineItem.unitPrice %}"
38               },
39               "Authentication": { "ConnectionArn": "{% $env.BILLING_CONN %}" }
40             },
41             "ResultPath": "$.charge",
42             "Next": "CheckChargeResult"
43           },
44           "CheckChargeResult": {
45             "Type": "Choice",
46             // src: BillingDispatch.bwp#L26  rule: CHOICE_TO_STEPFN_CHOICE@1.5
47             "Choices": [
48               { "Variable": "$.charge.StatusCode", "NumericEquals": 200, "Next": "LogSuccess" },
49               { "Variable": "$.charge.StatusCode", "NumericGreaterThanEquals": 500, "Next": "RetryAfterDelay" } ],
50             "Default": "LogFailure"
51           },
52           "RetryAfterDelay": {
53             "Type": "Wait",
54             // src: BillingDispatch.bwp#L32  rule: SLEEP_TO_STEPFN_WAIT@1.0
55             "Seconds": 2,
56             "Next": "ChargeAccount"   // retry loop preserved
57           },
58           "LogSuccess": {
59             "Type": "Pass",
60             // src: BillingDispatch.bwp#L36  rule: LOG_TO_PASS_LOG@1.0
61             "Parameters": {
62               "level": "INFO",
63               "msg.$": "States.Format('Charged {} for {} units', $.lineItem.sku, $.lineItem.qty)"
64             },
65             "End": true
66           },
67           "LogFailure": {
68             "Type": "Pass",
69             "Parameters": {
70               "level": "ERROR",
71               "msg.$": "States.Format('Failed {}, status {}', $.lineItem.sku, $.charge.StatusCode)"
72             },
73             "End": true
74           }
75         }
76       },
77       "ResultPath": "$.charges",
78       "Next": "PublishSummary"
79     },
80     "PublishSummary": { "Type": "Task", /* ... */ "End": true }
81   }
82 }

Lineage — the loop, the choice, the retry

Source construct | Rule | Target construct
<bw:group type="bw.generalactivities.Iterate"> at L7 | ITERATE_SEQ_TO_STEPFN_MAP@2.0 | Type: "Map" at L15
Sequence=true at L10 | (rule sub-rule) | MaxConcurrency: 1 at L18
IterationVariable=lineItem at L8 | (rule sub-rule) | ItemSelector.lineItem at L20
OverExpression=$Invoice/LineItems/Item at L9 | XPATH_TO_JSONPATH@1.8 | ItemsPath: "$.invoice.lineItems" at L17
IndexSlot=$_lineIdx at L11 | (rule sub-rule) | ItemSelector.lineIdx at L21
AccumulateOutput=true at L12 | (rule sub-rule) | ResultPath: "$.charges" at L77
<bw:activity ChargeAccount> (HTTP POST) at L14 | HTTP_TO_STEPFN_HTTP@1.2 | States.ChargeAccount at L27
<bw:activity CheckChargeResult> 3-way Choice at L26 | CHOICE_TO_STEPFN_CHOICE@1.5 | Choices[] + Default at L44
<bw:activity RetryAfterDelay type=Sleep DurationMs=2000> at L32 | SLEEP_TO_STEPFN_WAIT@1.0 | Type: "Wait", Seconds: 2 at L52
Branch L28 to="RetryAfterDelay" → implicit continue | RETRY_LOOPBACK@1.0 | RetryAfterDelay.Next: "ChargeAccount" at L56
WriteToLog at L36, L41 | LOG_TO_PASS_LOG@1.0 | States.LogSuccess, States.LogFailure

Three things to notice on the loop.
  1. Sequence=true isn't translated to a comment — it's translated to a functional constraint, MaxConcurrency: 1. If you flip it to false, the rule emits a Map with no concurrency cap and the lineage row changes from ITERATE_SEQ_TO_STEPFN_MAP to ITERATE_PAR_TO_STEPFN_MAP. Same node, different rule.
  2. The retry topology (Sleep → loop back to ChargeAccount) is preserved structurally. The rule writes RetryAfterDelay.Next = "ChargeAccount", not a comment saying "this should retry." Replay it: the 5xx path actually loops.
  3. AccumulateOutput=true would be easy to silently drop — that's exactly the kind of detail an LLM-led migration misses. The rule preserves it as ResultPath: $.charges, so the Map's accumulated array survives. Lineage row catches it.
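
The first and third points can be sketched as a single function that reads the Iterate group's properties and returns both the rule ID and the Map state fragment. This is an illustrative sketch, not the published rule: the function name iterate_to_map, the result_path parameter, and treating a missing property as "false" are assumptions.

```python
def iterate_to_map(props: dict, result_path: str = "$.charges") -> tuple[str, dict]:
    """Return (rule_id, Map state fragment) for a BW Iterate group."""
    sequential = props.get("Sequence", "false").lower() == "true"
    state: dict = {"Type": "Map"}

    if sequential:
        rule_id = "ITERATE_SEQ_TO_STEPFN_MAP"
        state["MaxConcurrency"] = 1       # functional constraint, not a comment
    else:
        rule_id = "ITERATE_PAR_TO_STEPFN_MAP"  # no concurrency cap emitted

    if props.get("AccumulateOutput", "false").lower() == "true":
        state["ResultPath"] = result_path  # keep the accumulated array

    return rule_id, state


seq_rule, seq_state = iterate_to_map({"Sequence": "true", "AccumulateOutput": "true"})
par_rule, par_state = iterate_to_map({"Sequence": "false", "AccumulateOutput": "true"})
```

Flipping Sequence changes both the emitted state and the rule ID recorded in lineage, so the difference is visible in the artifact and in the audit trail.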

Parallel group — three concurrent checks

Credit check + fraud check + KYC, fan-out then join. Tests that the engine preserves concurrency semantics.

SOURCE OnboardCustomer.bwp (excerpt)
 1 <bw:group name="ParallelEnrich" type="bw.generalactivities.Parallel">
 2   <bw:branch>
 3     <bw:activity name="CreditCheck" type="bw.http.SendHTTPRequest">
 4       <bw:property name="Url">https://credit.svc/check</bw:property>
 5     </bw:activity>
 6   </bw:branch>
 7   <bw:branch>
 8     <bw:activity name="FraudCheck" type="bw.http.SendHTTPRequest">
 9       <bw:property name="Url">https://fraud.svc/score</bw:property>
10     </bw:activity>
11   </bw:branch>
12   <bw:branch>
13     <bw:activity name="KYCCheck" type="bw.jdbc.JDBCQuery">
14       <bw:property name="SQL">SELECT status FROM kyc WHERE customer_id=?</bw:property>
15     </bw:activity>
16   </bw:branch>
17 </bw:group>
TARGET onboard-customer.asl.json (excerpt)
 1 "ParallelEnrich": {
 2   "Type": "Parallel",
 3   // src: OnboardCustomer.bwp#L1  rule: PARALLEL_GROUP_TO_STEPFN_PARALLEL@1.4
 4   "Branches": [
 5     {
 6       "StartAt": "CreditCheck",
 7       "States": { "CreditCheck": {
 8         "Type": "Task", "Resource": "arn:aws:states:::http:invoke",
 9         "Arguments": { "ApiEndpoint": "https://credit.svc/check", "Method": "POST" },
10         "End": true }}
11     },
12     { "StartAt": "FraudCheck", "States": { /* ... */ }},
13     { "StartAt": "KYCCheck",   "States": { /* ... */ }}
14   ],
15   "ResultSelector": {
16     "credit.$": "$[0]", "fraud.$": "$[1]", "kyc.$": "$[2]"
17   },
18   "Next": "DecideEligibility"
19 }
Why this matters. BW's bw.generalactivities.Parallel group joins implicitly when all branches finish. A Step Functions Parallel state has the same semantics: it completes only once every branch has finished, and its output is an array with one entry per branch, in branch order. ResultSelector projects each branch's output to a named slot, and the rule preserves the BW branch index order so downstream DecideEligibility sees credit, fraud, kyc exactly as the original did.
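
The index-to-name projection can be sketched in a few lines. The helper name result_selector is hypothetical, and how slot names such as credit are derived from activity names such as CreditCheck is not stated in this document, so the sketch takes them as input.

```python
def result_selector(slot_names: list[str]) -> dict:
    """Map each branch, by position, to a named slot in ResultSelector form."""
    return {f"{name}.$": f"$[{index}]" for index, name in enumerate(slot_names)}


# Slot order must follow the BW branch order: CreditCheck, FraudCheck, KYCCheck.
selector = result_selector(["credit", "fraud", "kyc"])
```

Reordering the input list changes which branch output lands in which slot, which is why the branch order has to come from the parse tree rather than from a name-based guess.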

XSLT mapper → JSONata

A BW Mapper activity body, including xsl:for-each, xsl:choose, and an aggregate expression.

SOURCE OrderToShipment.xsl
 1 <xsl:template match="/Order">
 2   <Shipment>
 3     <orderId><xsl:value-of select="@id"/></orderId>
 4     <customer>
 5       <name><xsl:value-of select="Customer/Name"/></name>
 6       <tier><xsl:value-of select="Customer/Tier"/></tier>
 7     </customer>
 8     <items>
 9       <xsl:for-each select="LineItems/Item">
10         <item>
11           <sku><xsl:value-of select="@sku"/></sku>
12           <weight><xsl:value-of select="Weight * Quantity"/></weight>
13         </item>
14       </xsl:for-each>
15     </items>
16     <totalWeight>
17       <xsl:value-of select="sum(LineItems/Item/Weight * LineItems/Item/Quantity)"/>
18     </totalWeight>
19     <priority>
20       <xsl:choose>
21         <xsl:when test="Customer/Tier = 'GOLD'">EXPRESS</xsl:when>
22         <xsl:otherwise>STANDARD</xsl:otherwise>
23       </xsl:choose>
24     </priority>
25   </Shipment>
26 </xsl:template>
TARGET mappers/order-to-shipment.jsonata
 1 /* Generated from OrderToShipment.xsl · rule: XSLT_TO_JSONATA@2.1 */
 2 {
 3   "orderId": $.order.id,
 4   "customer": {
 5     "name": $.order.customer.name,
 6     "tier": $.order.customer.tier
 7   },
 8   "items": $.order.lineItems.{
 9     "sku": sku,
10     "weight": weight * quantity
11   },
12   "totalWeight": $sum($.order.lineItems.(weight * quantity)),
13   "priority": $.order.customer.tier = "GOLD" ? "EXPRESS" : "STANDARD"
14 }

XPath / XSLT construct → JSONata rule table

XSLT construct | JSONata equivalent | Rule
<xsl:value-of select="@id"/> (L3) | $.order.id | XPATH_ATTR_TO_JSONATA@1.4
<xsl:value-of select="Customer/Name"/> (L5) | $.order.customer.name | XPATH_TO_JSONATA@1.4
<xsl:for-each select="LineItems/Item"> (L9) | $.order.lineItems.{...} | FOREACH_TO_JSONATA_MAP@1.2
Weight * Quantity (L12) | weight * quantity | XPATH_ARITH_TO_JSONATA@1.0
sum(... Weight * Quantity) (L17) | $sum($.order.lineItems.(weight * quantity)) | XPATH_SUM_TO_JSONATA@1.0
<xsl:choose> / <xsl:when> (L20-23) | tier = "GOLD" ? "EXPRESS" : "STANDARD" | XSLT_CHOOSE_TO_JSONATA_TERNARY@1.1

Honest about the long tail. The XSLT→JSONata rule library covers an XPath 1.0 subset (axis predicates, arithmetic, common string functions, sum/count/avg, choose, for-each) that, by our measurement, accounts for ~80% of real-world BW mappings. What is not in the rule library (XSLT keys and key-based joins, doc(), document(), custom XSLT extensions) is flagged, not silently translated. The flag appears in the verification report as HUMAN_REVIEW_REQUIRED with the file and line, and the migration is not marked complete until a reviewer signs off. We don't ask the LLM to "be creative" with XSLT extensions; we ask a human.
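
A rough sketch of the flagging behaviour follows. It uses a line-based regular expression purely for illustration; a real implementation would presumably inspect the parsed AST. The function name flag_unsupported, the record layout, and the sample stylesheet (including rates.xml) are hypothetical; only the HUMAN_REVIEW_REQUIRED status and the file-plus-line reporting come from the text above.

```python
import re

# Constructs named above as outside the rule library: keys, doc(), document().
UNSUPPORTED = re.compile(r"<xsl:key\b|\b(?:key|doc|document)\s*\(")


def flag_unsupported(xslt_text: str, filename: str) -> list[dict]:
    """Return one HUMAN_REVIEW_REQUIRED record per line with an unsupported construct."""
    flags = []
    for lineno, line in enumerate(xslt_text.splitlines(), start=1):
        match = UNSUPPORTED.search(line)
        if match:
            flags.append({
                "status": "HUMAN_REVIEW_REQUIRED",
                "file": filename,
                "line": lineno,
                "construct": match.group(0).strip(),
            })
    return flags


# Hypothetical stylesheet: line 2 is in the subset, line 3 is not.
SAMPLE = """<xsl:template match="/Order">
  <xsl:value-of select="Customer/Name"/>
  <xsl:value-of select="document('rates.xml')/rates/rate"/>
</xsl:template>"""

flags = flag_unsupported(SAMPLE, "Example.xsl")
```

The flagger only reports; it never rewrites the construct, which keeps the translation path and the review path separate.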

Java code activity → Lambda

The case people assume needs an LLM. It doesn't — for the common case the rule preserves the Java verbatim. The LLM only enters when the body references a BW-internal class.

SOURCE OrderIngest.bwp (Java activity body)
 1 <bw:activity name="CalculateChecksum" type="bw.generalactivities.JavaCode">
 2   <bw:input>
 3     <payload xsi:type="xsd:string"/>
 4   </bw:input>
 5   <bw:output>
 6     <checksum xsi:type="xsd:string"/>
 7   </bw:output>
 8   <bw:javacode>
 9     import java.security.MessageDigest;
10     import java.nio.charset.StandardCharsets;
11
12     public String execute(String payload) throws Exception {
13       MessageDigest md = MessageDigest.getInstance("SHA-256");
14       byte[] hash = md.digest(payload.getBytes(StandardCharsets.UTF_8));
15       StringBuilder sb = new StringBuilder();
16       for (byte b : hash) {
17         sb.append(String.format("%02x", b));
18       }
19       return sb.toString();
20     }
21   </bw:javacode>
22 </bw:activity>
TARGET lambda/calculate-checksum/Handler.java
 1 // Generated from CalculateChecksum activity · rule: JAVACODE_TO_LAMBDA@2.0
 2 // Body preserved verbatim · src-body-hash: d1e2f3a4...
 3 package com.cognidev.generated;
 4
 5 import com.amazonaws.services.lambda.runtime.Context;
 6 import com.amazonaws.services.lambda.runtime.RequestHandler;
 7 import java.security.MessageDigest;
 8 import java.nio.charset.StandardCharsets;
 9 import java.util.Map;
10 import java.util.HashMap;
11
12 public class Handler implements RequestHandler<Map<String,String>, Map<String,String>> {
13
14   // Contract gate: input key 'payload' matches bw:input/payload@xsd:string
15   public Map<String,String> handleRequest(Map<String,String> in, Context ctx) {
16     try {
17       String checksum = execute(in.get("payload"));
18       Map<String,String> out = new HashMap<>();
19       out.put("checksum", checksum);   // matches bw:output/checksum@xsd:string
20       return out;
21     } catch (Exception e) { throw new RuntimeException(e); }
22   }
23
24   // === BW body — preserved verbatim ===
25   public String execute(String payload) throws Exception {
26     MessageDigest md = MessageDigest.getInstance("SHA-256");
27     byte[] hash = md.digest(payload.getBytes(StandardCharsets.UTF_8));
28     StringBuilder sb = new StringBuilder();
29     for (byte b : hash) {
30       sb.append(String.format("%02x", b));
31     }
32     return sb.toString();
33   }
34 }
The LLM did not write any of this. The rule extracts the Java body, wraps it in a Lambda handler with the BW input/output contract enforced, and ships it. The for (byte b : hash) loop is preserved byte-for-byte. The hash of the source body is recorded in the evidence pack so an auditor can prove no rewrite happened.

The exception, and the place the LLM does earn its keep, is when the BW Java references a vendor-internal class like com.tibco.pe.plugin.SubprocessInvoker. The rule library has no Lambda equivalent. The LLM gets one shot at proposing a replacement; the proposal must compile, match the input/output schema, and produce identical output on the replay set. If it fails any of those, the activity is flagged for human review. There's no "the LLM tries again with more context" loop on a failing contract.
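
An auditor could reproduce the "no rewrite happened" check along these lines. This is a sketch under assumptions: the document does not say which hash function or normalization the engine uses, so SHA-256 and the removal of common indentation (the source body sits deeper inside the XML than it does inside the Java class) are choices made for the example, as is the name body_hash.

```python
import hashlib
import textwrap


def body_hash(java_body: str) -> str:
    """Hash a code body after removing common indentation and outer whitespace.

    Assumption: indentation depth is not significant, relative indentation is.
    """
    normalized = textwrap.dedent(java_body).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical two-line body, indented as it would be in the .bwp and in the handler.
source_body = """
    public String execute(String p) {
      return p;
    }
"""
target_body = """
  public String execute(String p) {
    return p;
  }
"""
tampered_body = target_body.replace("return p;", "return p.trim();")

same = body_hash(source_body) == body_hash(target_body)
changed = body_hash(source_body) != body_hash(tampered_body)
```

Any edit to the method text, however small, changes the digest, so a matching digest in the evidence pack is a cheap proof that the body was carried over unchanged.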

Schema equivalence — how we prove semantic match

Every migration emits a schema-diff.json that compares the source XSD field-by-field against the target JSON Schema. This is the artifact a customer's architect reads first.

 1 {
 2   "schema": "Order",
 3   "source": { "file": "/Schemas/Order.xsd", "hash": "7e2d…" },
 4   "target": { "file": "/schemas/order.schema.json", "hash": "a91f…" },
 5   "fields": [
 6     { "path": "Order/@id",             "src": "xsd:string",   "tgt": "string",            "status": "MATCH" },
 7     { "path": "Order/Customer/Name",   "src": "xsd:string",   "tgt": "string",            "status": "MATCH" },
 8     { "path": "Order/Customer/Tier",   "src": "xsd:string",   "tgt": "string",            "status": "MATCH" },
 9     { "path": "Order/LineItems/Item[]/@sku",    "src": "xsd:string", "tgt": "string", "status": "MATCH" },
10     { "path": "Order/LineItems/Item[]/Weight",   "src": "xsd:decimal", "tgt": "number", "status": "MATCH" },
11     { "path": "Order/LineItems/Item[]/Quantity", "src": "xsd:int",     "tgt": "integer", "status": "MATCH" },
12     { "path": "Order/@createdAt",        "src": "xsd:dateTime", "tgt": "string (date-time)", "status": "MATCH_WITH_FORMAT_HINT" }
13   ],
14   "summary": { "match": 6, "format_hint": 1, "mismatch": 0, "unmapped": 0 },
15   "replay": { "samples": 12, "passed": 12, "failed": 0, "divergence_fields": [] }
16 }

MATCH_WITH_FORMAT_HINT means the values are equivalent but expressed differently (XSD xsd:dateTime → JSON Schema string with format: date-time). The replay block proves it with sample data: 12 input messages from the source side were run through both the original BW process and the generated Step Functions state machine, and the resulting outputs were diffed field by field. Zero divergence means the migration is semantically equivalent for the sample set; the same harness keeps running in CI after deploy.
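
The field-by-field comparison behind divergence_fields can be sketched as a recursive diff over two decoded outputs. The function below is an illustration, not the engine's harness; its name is borrowed from the report key, and the path notation (dots and [i]) is an assumption.

```python
def divergence_fields(expected, actual, path: str = "") -> list[str]:
    """Return the paths at which two decoded outputs differ (empty list if none)."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        diffs = []
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}.{key}" if path else key
            if key not in expected or key not in actual:
                diffs.append(child)  # field present on one side only
            else:
                diffs.extend(divergence_fields(expected[key], actual[key], child))
        return diffs
    if isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            return [path]  # report the array itself when lengths differ
        diffs = []
        for i, (e, a) in enumerate(zip(expected, actual)):
            diffs.extend(divergence_fields(e, a, f"{path}[{i}]"))
        return diffs
    return [] if expected == actual else [path]


# Hypothetical outputs from the BW process and the generated state machine.
bw_out = {"customer": {"tier": "GOLD"}, "items": [{"weight": 2}, {"weight": 5}]}
aws_out = {"customer": {"tier": "GOLD"}, "items": [{"weight": 2}, {"weight": 6}]}

diffs = divergence_fields(bw_out, aws_out)
```

A replay sample passes only when this list is empty; any non-empty result names the exact fields a reviewer should look at.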

The evidence pack — what an auditor receives

Every migration ships a signed evidence bundle. An auditor opens it and answers their own questions; no presentation needed.

  • source-tree.json — every source file with hash, size, and parse status
  • dependency-graph.json — the full DAG with all transitions and shared resources
  • rules-applied.csv — one row per node: source_file,source_line,node_type,rule_id,rule_version,target_file,target_line
  • schema-diff.json — field-by-field XSD ↔ JSON Schema match
  • replay-results.json — sample-message replay with field-level divergence (zero or otherwise)
  • llm-proposals.jsonl — every LLM call: prompt, response, contract-gate verdict (accepted / rejected with reason). Empty if the project had no long-tail bodies.
  • human-review.md — activities flagged for human review with file, line, reason
  • rule-library-manifest.json — every rule used, with version, hash, and a link to its source in the published library
  • signature.sig — signature of the bundle so downstream consumers can verify integrity

A reviewer can pick any row in rules-applied.csv, open the cited rule in the manifest, open the source and target files at the cited lines, and verify with their own eyes that the rule did what it claims. That's what we mean by audit-grade.
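
That spot check is easy to script. The sketch below reads rules-applied.csv using the column list given above and pulls the cited source and target lines for one row. The helper names are hypothetical, the sample row is assembled from the first lineage row of the OrderIngest example, and the node_type value is an assumption about what that column holds.

```python
import csv
import io

COLUMNS = "source_file,source_line,node_type,rule_id,rule_version,target_file,target_line"

# Illustrative row; node_type is an assumed value.
SAMPLE = COLUMNS + "\n" + (
    "OrderIngest.bwp,5,bw.jms.JMSQueueReceiver,JMS_RECEIVE_XA_TO_AMAZON_MQ,2.3,"
    "infra/state-machines/order-ingest.asl.json,6\n"
)


def load_rows(csv_text: str) -> list[dict]:
    """Parse rules-applied.csv content into one dict per node."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def cited_line(file_text: str, line_no: int) -> str:
    """Return the 1-indexed line a lineage row points at."""
    return file_text.splitlines()[line_no - 1]


def check_row(row: dict, source_text: str, target_text: str) -> dict:
    """Collect what a reviewer needs to eyeball for a single row."""
    return {
        "rule": f'{row["rule_id"]}@{row["rule_version"]}',
        "source": cited_line(source_text, int(row["source_line"])),
        "target": cited_line(target_text, int(row["target_line"])),
    }


rows = load_rows(SAMPLE)
```

In practice source_text and target_text would be read from the files named in the row; they are passed in as strings here so the sketch has no file-system dependency.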

What we don't claim

A document going to a customer is worthless if it overclaims. Here is what this engine is not:

  • Not 100% automated for every BW project. Custom XSLT extensions, vendor-internal Java classes, exotic adapter configs, and rarely-used activities are flagged for human review. The flag itself is part of the evidence pack — you see it before you ship.
  • Not "drop-in zero-effort." Customers still wire up IAM, VPCs, secrets, observability, and any business-side decisions that the BW project didn't make explicit (e.g. DLQ retention, max-receive count). The engine emits sensible defaults; you tune them.
  • Not a behavioural-equivalence proof for every input. Replay equivalence is over the sample set, which we make as wide as we can reach. Production-grade testing for safety-critical workloads still needs a domain-specific test plan.
  • Not a guarantee that AWS will price exactly like TIBCO. Cost shape changes — Amazon MQ vs EMS, Lambda vs persistent JVM, Step Functions Express vs Standard — we document the trade-offs; you choose.
  • The LLM is bounded, not absent. It writes names, comments, fixtures, narratives, and proposes long-tail replacements behind a contract gate. We claim "deterministic structure with bounded synthesis," not "no LLM in the building."

If any of these matter, ask. If your project is the kind where they don't, the engine ships a complete, reviewable migration with audit-grade evidence — and the LLM stays where it belongs.