How CogniDev migrates a TIBCO project to AWS — and how you verify every line of it.
This document walks through five concrete file pairs from a representative TIBCO BusinessWorks project (the source) and the AWS artifacts CogniDev emits (the target). For each pair, you see the exact source construct, the deterministic rule that fired, the target artifact, and a line-level lineage proof. The goal is to make every claim on cognidev.ai testable on real files.
The deterministic cycle
Every migration runs through six phases. Each phase emits a signed artifact that the next phase reads. Nothing is implicit; everything is on disk and re-runnable.
- 01 Parse. Read every .bwp (process), .process, .substvar (substitution variables), .xpdl, .wsdl, .xsd, adapter config, JMS destination, and JDBC binding. Build a typed AST per file. Output: parse-tree.json.
- 02 Graph. Resolve transitions, sub-process calls, JMS topic flows, shared resources. Build the project-wide DAG. Output: dependency-graph.json, with every activity as a node and every transition / message flow as an edge.
- 03 Classify. Walk the DAG and tag idiomatic shapes (linear, iterate, parallel, choice, request-reply, file-poll, claim-check, dead-letter). Tag transactional boundaries (XA, local). Output: classification.json.
- 04 Map. For every node and every shape, look up the rule in the published rule library. Each rule has an ID, a version, an input schema (which BW node it accepts) and an output schema (which AWS artifact it produces). Output: rules-applied.csv, one row per node.
- 05 Emit. Each rule emits its target artifact: Step Functions ASL, Lambda handler stubs, CDK / Terraform IaC, IAM policies, Amazon MQ / SQS / SNS / MSK configs, RDS schemas, JSONata mappers. Output: a complete AWS project tree.
- 06 Verify. Schema-in / schema-out equivalence between source and target. Sample-message replay where source endpoints are reachable. Static analysis on the emitted code. Output: verification-report.json plus the signed evidence pack.
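As a rough illustration of the Graph and Classify phases, the sketch below builds a dependency graph from a list of transitions and tags it as linear when it forms a single chain. The transitions are copied from the OrderIngest example; the function names `build_graph` and `classify_shape` are hypothetical and say nothing about how CogniDev implements these phases.

```python
from collections import defaultdict

# Transitions copied from OrderIngest.bwp; everything else here is illustrative.
TRANSITIONS = [
    ("ReceiveOrder", "ParseOrder"),
    ("ParseOrder", "LookupCustomer"),
    ("LookupCustomer", "PublishEnriched"),
]


def build_graph(transitions):
    """Return (nodes, successors, predecessors) for a list of (from, to) edges."""
    nodes, succ, pred = [], defaultdict(list), defaultdict(list)
    for src, dst in transitions:
        for name in (src, dst):
            if name not in nodes:
                nodes.append(name)
        succ[src].append(dst)
        pred[dst].append(src)
    return nodes, succ, pred


def classify_shape(transitions):
    """Tag the graph 'linear' when it is one unbranched chain, else 'other'."""
    nodes, succ, pred = build_graph(transitions)
    starts = [n for n in nodes if not pred[n]]
    if len(starts) != 1:
        return "other"
    seen, current = [], starts[0]
    while True:
        seen.append(current)
        # Any fan-out or fan-in disqualifies the linear shape.
        if len(succ[current]) > 1 or len(pred[current]) > 1:
            return "other"
        if not succ[current]:
            break
        current = succ[current][0]
        if current in seen:  # defensive guard against cycles
            return "other"
    return "linear" if len(seen) == len(nodes) else "other"
```

Adding a second outgoing transition from any activity makes the same function return "other", which is the point at which a real classifier would try the remaining shapes.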
The LLM boundary
The LLM is not a translator. It is a writer with a strictly bounded scope, and every output it produces is checked by a deterministic gate before it is allowed into a generated artifact. This is the part of the system that lets us claim "deterministic" honestly.
Five bounded roles
- Naming. Proposes business-domain names for generated Lambdas, queues, state machines, IAM roles. The proposal must pass: uniqueness in the namespace, length / regex / reserved-word checks, no PII in the name.
- Comments. Writes the human-readable line above each generated state ("// Charges a single line item against billing service"). No structural impact — comments are stripped before semantic comparison.
- Test fixtures. Generates boundary-condition input samples. Each sample must conform to the source schema before being added to the replay set.
- Long-tail body proposals. When an embedded code activity references a vendor-internal class the rule library doesn't know (e.g. com.tibco.pe.plugin.*), the LLM proposes one replacement. The replacement must compile, must accept the source input schema, must return the source output schema, and must produce identical output for the replay set. One shot. If it fails the gate, the activity is flagged for human review; there is no iteration loop on a failing contract.
- Reviewer narrative. Drafts the cover memo for the human reviewer (what migrated cleanly, what was flagged, what to look at first). No structural impact.
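The naming gate from the first role can be pictured as a plain function that returns the list of failed checks. This is a minimal sketch under stated assumptions: the regex, the 64-character limit, the reserved-word set, and the PII heuristic are all hypothetical stand-ins, not CogniDev's actual checks.

```python
import re

NAME_PATTERN = re.compile(r"^[a-z][a-z0-9-]*$")  # assumed naming convention
MAX_LENGTH = 64                                   # assumed length limit
RESERVED = {"aws", "default", "test"}             # hypothetical reserved words
PII_PATTERN = re.compile(r"\d{6,}|@")             # crude stand-in for a PII scan


def check_name(proposal, taken):
    """Return the list of gate failures; an empty list means the name passes."""
    failures = []
    if proposal in taken:
        failures.append("not unique")
    if len(proposal) > MAX_LENGTH:
        failures.append("too long")
    if not NAME_PATTERN.match(proposal):
        failures.append("bad characters")
    if proposal in RESERVED:
        failures.append("reserved word")
    if PII_PATTERN.search(proposal):
        failures.append("possible PII")
    return failures
```

Returning every failure rather than stopping at the first keeps the rejection reason complete when it is logged.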
What the rule library owns
- Activity → service mapping. A BW JMS-XA receiver always becomes Amazon MQ. A BW JMS non-transactional receiver always becomes SQS. The choice is in the rule, not in a prompt.
- Control flow. Iterate, Parallel, Choice, Catch/Retry — all translated by rules with explicit AST-to-AST mappings.
- IaC. Every CDK / Terraform line comes from a template parameterized by the parse tree.
- Schema translation. XSD → JSON Schema is a published rule set; we do not let an LLM "interpret" a type.
- XPath → JSONata / JSONPath. Covered by a rule library for the XPath 1.0 subset. Anything outside the subset is flagged, not silently guessed.
- Lineage. Every output line has a deterministic back-pointer to a source line + a rule ID + a rule version. The LLM cannot fabricate this.
A failure at any step rejects the proposal. The LLM does not get a retry loop. If the rule library can't pick it up either, the artifact is flagged for human review and the evidence pack records the gap.
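One way to picture the one-shot contract gate is a function that runs a proposed replacement against the replay set exactly once and reports either acceptance or a reason for human review. The sketch below assumes the candidate is already a callable taking and returning dictionaries (so the "must compile" check is out of scope), and the names `contract_gate` and `review_or_accept` are hypothetical.

```python
def contract_gate(candidate, input_keys, output_keys, replay_set):
    """Run one proposal through the gate once. Returns (accepted, reason)."""
    for sample, expected in replay_set:
        if set(sample) != set(input_keys):
            return False, "replay sample does not match input schema"
        try:
            actual = candidate(sample)
        except Exception as exc:
            return False, f"candidate raised {type(exc).__name__}"
        if not isinstance(actual, dict) or set(actual) != set(output_keys):
            return False, "output does not match output schema"
        if actual != expected:
            return False, "output diverges from replay set"
    return True, "accepted"


def review_or_accept(candidate, input_keys, output_keys, replay_set):
    """No retry loop: a rejected proposal goes straight to human review."""
    accepted, reason = contract_gate(candidate, input_keys, output_keys, replay_set)
    return "ACCEPTED" if accepted else f"HUMAN_REVIEW_REQUIRED: {reason}"
```

The caller never feeds the rejection reason back to the proposer; it only records it.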
Linear process — order ingestion
JMS-XA receive → XML parse → JDBC lookup → JMS-XA publish. The simplest control flow; the most important rule decision (XA boundary).
OrderIngest.bwp
     1  <?xml version="1.0" encoding="UTF-8"?>
     2  <bw:process name="OrderIngest"
     3      xmlns:bw="http://xsd.tns.tibco.com/amf/models/sharedresource/bw">
     4
     5    <bw:starter name="ReceiveOrder" type="bw.jms.JMSQueueReceiver">
     6      <bw:property name="ConnectionFactory">JNDI:QCF.Orders</bw:property>
     7      <bw:property name="QueueName">ORDERS.IN</bw:property>
     8      <bw:property name="TransactionMode">XA</bw:property>
     9    </bw:starter>
    10
    11    <bw:activity name="ParseOrder" type="bw.xml.ParseXML">
    12      <bw:property name="Schema">/Schemas/Order.xsd</bw:property>
    13    </bw:activity>
    14
    15    <bw:activity name="LookupCustomer" type="bw.jdbc.JDBCQuery">
    16      <bw:property name="DataSource">JNDI:DS.CustomerDB</bw:property>
    17      <bw:property name="SQL">SELECT id, tier, region FROM customers WHERE id = ?</bw:property>
    18    </bw:activity>
    19
    20    <bw:activity name="PublishEnriched" type="bw.jms.JMSQueueSender">
    21      <bw:property name="ConnectionFactory">JNDI:QCF.Orders</bw:property>
    22      <bw:property name="QueueName">ORDERS.ENRICHED</bw:property>
    23      <bw:property name="DeliveryMode">PERSISTENT</bw:property>
    24    </bw:activity>
    25
    26    <bw:transitions>
    27      <bw:transition from="ReceiveOrder" to="ParseOrder"/>
    28      <bw:transition from="ParseOrder" to="LookupCustomer"/>
    29      <bw:transition from="LookupCustomer" to="PublishEnriched"/>
    30    </bw:transitions>
    31  </bw:process>
infra/state-machines/order-ingest.asl.json
     1  {
     2    "Comment": "Generated from OrderIngest.bwp · rule-set 4.7 · src-hash a3f9b2c1",
     3    "StartAt": "ReceiveOrder",
     4    "QueryLanguage": "JSONata",
     5    "States": {
     6      "ReceiveOrder": {
     7        "Type": "Task",
     8        // src: OrderIngest.bwp#L5  rule: JMS_RECEIVE_XA_TO_AMAZON_MQ@2.3
     9        "Resource": "arn:aws:states:::aws-sdk:mq:receiveMessage",
    10        "Arguments": {
    11          "BrokerId": "{% $broker.orders %}",
    12          "QueueName": "ORDERS.IN",
    13          "WaitTimeSeconds": 20
    14        },
    15        "Next": "ParseOrder"
    16      },
    17      "ParseOrder": {
    18        "Type": "Task",
    19        // src: OrderIngest.bwp#L11  rule: XML_VALIDATE_TO_LAMBDA@1.4
    20        "Resource": "arn:aws:states:::lambda:invoke",
    21        "Arguments": {
    22          "FunctionName": "validate-order-schema",
    23          "Payload": { "schema": "Order.xsd", "xml": "{% $states.input.Messages[0].Body %}" }
    24        },
    25        "ResultPath": "$.order",
    26        "Next": "LookupCustomer"
    27      },
    28      "LookupCustomer": {
    29        "Type": "Task",
    30        // src: OrderIngest.bwp#L15  rule: JDBC_PARAM_TO_RDS_DATA_API@3.2
    31        "Resource": "arn:aws:states:::aws-sdk:rdsdata:executeStatement",
    32        "Arguments": {
    33          "ResourceArn": "{% $env.CUSTOMER_DB_ARN %}",
    34          "SecretArn": "{% $env.CUSTOMER_DB_SECRET %}",
    35          "Database": "customers",
    36          "Sql": "SELECT id, tier, region FROM customers WHERE id = :id",
    37          "Parameters": [{ "name": "id", "value": { "stringValue": "{% $order.customerId %}" }}]
    38        },
    39        "ResultPath": "$.customer",
    40        "Next": "PublishEnriched"
    41      },
    42      "PublishEnriched": {
    43        "Type": "Task",
    44        // src: OrderIngest.bwp#L20  rule: JMS_SEND_TO_AMAZON_MQ@2.3
    45        "Resource": "arn:aws:states:::aws-sdk:mq:sendMessage",
    46        "Arguments": {
    47          "BrokerId": "{% $broker.orders %}",
    48          "QueueName": "ORDERS.ENRICHED",
    49          "DeliveryMode": "PERSISTENT",
    50          "MessageBody": "{% $order ~> |$| {'customer': $customer.records[0]} | %}"
    51        },
    52        "End": true
    53      }
    54    }
    55  }
Lineage
| Source (line) | Rule (id @ version) | Target (line) |
|---|---|---|
| OrderIngest.bwp#L5: <bw:starter ReceiveOrder> + TransactionMode=XA | JMS_RECEIVE_XA_TO_AMAZON_MQ@2.3 | order-ingest.asl.json#L6: States.ReceiveOrder |
| #L11: <bw:activity ParseOrder type=bw.xml.ParseXML> | XML_VALIDATE_TO_LAMBDA@1.4 | #L17: States.ParseOrder + lambda/validate-order-schema/ |
| #L15: <bw:activity LookupCustomer type=bw.jdbc.JDBCQuery> | JDBC_PARAM_TO_RDS_DATA_API@3.2 | #L28: States.LookupCustomer |
| #L20: <bw:activity PublishEnriched type=bw.jms.JMSQueueSender> | JMS_SEND_TO_AMAZON_MQ@2.3 | #L42: States.PublishEnriched |
| #L27-29: <bw:transitions> | LINEAR_TRANSITION_TO_NEXT@1.0 | States.*.Next chain |
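Because each generated state carries a `// src: ... rule: ...@version` comment, the lineage rows above can in principle be re-derived from the target file alone. The following sketch shows one way to do that with a regular expression; the pattern and the `extract_lineage` helper are assumptions based on the comment format visible in the listing, not a published parser.

```python
import re

# Matches comments such as:
#   // src: OrderIngest.bwp#L5  rule: JMS_RECEIVE_XA_TO_AMAZON_MQ@2.3
LINEAGE = re.compile(
    r"//\s*src:\s*(?P<file>\S+)#L(?P<line>\d+)\s+"
    r"rule:\s*(?P<rule>[A-Z0-9_]+)@(?P<version>[\d.]+)"
)


def extract_lineage(target_text):
    """Return one lineage row per annotated line in a generated artifact."""
    rows = []
    for target_line, text in enumerate(target_text.splitlines(), start=1):
        match = LINEAGE.search(text)
        if match:
            rows.append({
                "source_file": match.group("file"),
                "source_line": int(match.group("line")),
                "rule_id": match.group("rule"),
                "rule_version": match.group("version"),
                "target_line": target_line,  # line holding the comment itself
            })
    return rows
```

A reviewer could diff the rows recovered this way against the lineage table to confirm that the two agree.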
TransactionMode=XA on the receiver routed the migration to Amazon MQ (ActiveMQ engine, which supports XA transactions over JMS) instead of SQS (which does not). The rule body literally reads: if bw.jms.* and TransactionMode in {XA, LOCAL_TRANSACTED}: target = AmazonMQ else: target = SQS. The choice is a property of the rule, published and versioned; the LLM did not pick it. Change TransactionMode to NONE and re-run: the rule lookup changes, the target changes, and the lineage row records the swap. That's what "deterministic" means in operational terms.
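The rule body quoted above is small enough to restate as a function. This sketch mirrors the quoted condition; the function name `pick_jms_target` and the decision to raise on non-JMS activity types are assumptions added for illustration.

```python
TRANSACTED_MODES = {"XA", "LOCAL_TRANSACTED"}


def pick_jms_target(activity_type, transaction_mode):
    """Choose the messaging target for a BW JMS activity from its properties."""
    if not activity_type.startswith("bw.jms."):
        # Assumption: other activity types are handled by other rules.
        raise ValueError(f"not a JMS activity: {activity_type}")
    return "AmazonMQ" if transaction_mode in TRANSACTED_MODES else "SQS"
```

Calling it with `"XA"` and then `"NONE"` for the same receiver reproduces the swap described in the paragraph: the input property changes, so the target changes, with no model in the loop.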
For Each loop — billing dispatch (the one you asked for)
A bw.generalactivities.Iterate group, sequential mode, with a nested Choice and a Sleep-based retry. Real-world control-flow density.
BillingDispatch.bwp
     1  <bw:process name="BillingDispatch" xmlns:bw="...">
     2
     3    <bw:starter name="ReceiveInvoice" type="bw.jms.JMSQueueReceiver">
     4      <bw:property name="QueueName">INVOICE.READY</bw:property>
     5    </bw:starter>
     6
     7    <bw:group name="ForEachLineItem" type="bw.generalactivities.Iterate">
     8      <bw:property name="IterationVariable">lineItem</bw:property>
     9      <bw:property name="OverExpression">$Invoice/LineItems/Item</bw:property>
    10      <bw:property name="Sequence">true</bw:property>
    11      <bw:property name="IndexSlot">$_lineIdx</bw:property>
    12      <bw:property name="AccumulateOutput">true</bw:property>
    13
    14      <bw:activity name="ChargeAccount" type="bw.http.SendHTTPRequest">
    15        <bw:property name="Url">https://billing.svc/charge</bw:property>
    16        <bw:property name="Method">POST</bw:property>
    17        <bw:input>
    18          <body>
    19            <sku>$lineItem/sku</sku>
    20            <qty>$lineItem/qty</qty>
    21            <unitPrice>$lineItem/unitPrice</unitPrice>
    22          </body>
    23        </bw:input>
    24      </bw:activity>
    25
    26      <bw:activity name="CheckChargeResult" type="bw.generalactivities.Choice">
    27        <bw:branch condition="$ChargeAccount/statusCode = 200" to="LogSuccess"/>
    28        <bw:branch condition="$ChargeAccount/statusCode >= 500" to="RetryAfterDelay"/>
    29        <bw:branch condition="true" to="LogFailure"/>
    30      </bw:activity>
    31
    32      <bw:activity name="RetryAfterDelay" type="bw.generalactivities.Sleep">
    33        <bw:property name="DurationMs">2000</bw:property>
    34      </bw:activity>
    35
    36      <bw:activity name="LogSuccess" type="bw.generalactivities.WriteToLog">
    37        <bw:property name="Level">INFO</bw:property>
    38        <bw:property name="Message">Charged $lineItem/sku for $lineItem/qty units</bw:property>
    39      </bw:activity>
    40
    41      <bw:activity name="LogFailure" type="bw.generalactivities.WriteToLog">
    42        <bw:property name="Level">ERROR</bw:property>
    43        <bw:property name="Message">Failed to charge $lineItem/sku, status $ChargeAccount/statusCode</bw:property>
    44      </bw:activity>
    45
    46    </bw:group>
    47
    48    <bw:activity name="PublishSummary" type="bw.jms.JMSQueueSender">
    49      <bw:property name="QueueName">INVOICE.DISPATCHED</bw:property>
    50    </bw:activity>
    51  </bw:process>
infra/state-machines/billing-dispatch.asl.json
     1  {
     2    "Comment": "Generated from BillingDispatch.bwp · rule-set 4.7",
     3    "StartAt": "ReceiveInvoice",
     4    "QueryLanguage": "JSONata",
     5    "States": {
     6      "ReceiveInvoice": {
     7        "Type": "Task",
     8        // src: BillingDispatch.bwp#L3  rule: JMS_RECEIVE_TO_SQS@2.1
     9        "Resource": "arn:aws:states:::aws-sdk:sqs:receiveMessage",
    10        "Arguments": { "QueueUrl": "{% $queues.invoice_ready %}" },
    11        "ResultPath": "$.invoice",
    12        "Next": "ForEachLineItem"
    13      },
    14      "ForEachLineItem": {
    15        "Type": "Map",
    16        // src: BillingDispatch.bwp#L7  rule: ITERATE_SEQ_TO_STEPFN_MAP@2.0
    17        "ItemsPath": "$.invoice.lineItems",
    18        "MaxConcurrency": 1,                      // from Sequence=true
    19        "ItemSelector": {
    20          "lineItem.$": "$$.Map.Item.Value",
    21          "lineIdx.$": "$$.Map.Item.Index"        // from IndexSlot
    22        },
    23        "ItemProcessor": {
    24          "ProcessorConfig": { "Mode": "INLINE" },
    25          "StartAt": "ChargeAccount",
    26          "States": {
    27            "ChargeAccount": {
    28              "Type": "Task",
    29              // src: BillingDispatch.bwp#L14  rule: HTTP_TO_STEPFN_HTTP@1.2
    30              "Resource": "arn:aws:states:::http:invoke",
    31              "Arguments": {
    32                "ApiEndpoint": "https://billing.svc/charge",
    33                "Method": "POST",
    34                "RequestBody": {
    35                  "sku": "{% $lineItem.sku %}",
    36                  "qty": "{% $lineItem.qty %}",
    37                  "unitPrice": "{% $lineItem.unitPrice %}"
    38                },
    39                "Authentication": { "ConnectionArn": "{% $env.BILLING_CONN %}" }
    40              },
    41              "ResultPath": "$.charge",
    42              "Next": "CheckChargeResult"
    43            },
    44            "CheckChargeResult": {
    45              "Type": "Choice",
    46              // src: BillingDispatch.bwp#L26  rule: CHOICE_TO_STEPFN_CHOICE@1.5
    47              "Choices": [
    48                { "Variable": "$.charge.StatusCode", "NumericEquals": 200, "Next": "LogSuccess" },
    49                { "Variable": "$.charge.StatusCode", "NumericGreaterThanEquals": 500, "Next": "RetryAfterDelay" } ],
    50              "Default": "LogFailure"
    51            },
    52            "RetryAfterDelay": {
    53              "Type": "Wait",
    54              // src: BillingDispatch.bwp#L32  rule: SLEEP_TO_STEPFN_WAIT@1.0
    55              "Seconds": 2,
    56              "Next": "ChargeAccount"             // retry loop preserved
    57            },
    58            "LogSuccess": {
    59              "Type": "Pass",
    60              // src: BillingDispatch.bwp#L36  rule: LOG_TO_PASS_LOG@1.0
    61              "Parameters": {
    62                "level": "INFO",
    63                "msg.$": "States.Format('Charged {} for {} units', $.lineItem.sku, $.lineItem.qty)"
    64              },
    65              "End": true
    66            },
    67            "LogFailure": {
    68              "Type": "Pass",
    69              "Parameters": {
    70                "level": "ERROR",
    71                "msg.$": "States.Format('Failed {}, status {}', $.lineItem.sku, $.charge.StatusCode)"
    72              },
    73              "End": true
    74            }
    75          }
    76        },
    77        "ResultPath": "$.charges",
    78        "Next": "PublishSummary"
    79      },
    80      "PublishSummary": { "Type": "Task", /* ... */ "End": true }
    81    }
    82  }
Lineage — the loop, the choice, the retry
| Source construct | Rule | Target construct |
|---|---|---|
| <bw:group type="bw.generalactivities.Iterate"> at L7 | ITERATE_SEQ_TO_STEPFN_MAP@2.0 | Type: "Map" at L15 |
| Sequence=true at L10 | (rule sub-rule) | MaxConcurrency: 1 at L18 |
| IterationVariable=lineItem at L8 | (rule sub-rule) | ItemSelector.lineItem at L20 |
| OverExpression=$Invoice/LineItems/Item at L9 | XPATH_TO_JSONPATH@1.8 | ItemsPath: "$.invoice.lineItems" at L17 |
| IndexSlot=$_lineIdx at L11 | (rule sub-rule) | ItemSelector.lineIdx at L21 |
| AccumulateOutput=true at L12 | (rule sub-rule) | ResultPath: "$.charges" at L77 |
| <bw:activity ChargeAccount> (HTTP POST) at L14 | HTTP_TO_STEPFN_HTTP@1.2 | States.ChargeAccount at L27 |
| <bw:activity CheckChargeResult> 3-way Choice at L26 | CHOICE_TO_STEPFN_CHOICE@1.5 | 2× Choices[] + Default at L44 |
| <bw:activity RetryAfterDelay type=Sleep DurationMs=2000> at L32 | SLEEP_TO_STEPFN_WAIT@1.0 | Type: "Wait", Seconds: 2 at L52 |
| Branch L28 to="RetryAfterDelay" → implicit continue | RETRY_LOOPBACK@1.0 | RetryAfterDelay.Next: "ChargeAccount" at L56 |
| WriteToLog at L36, L41 | LOG_TO_PASS_LOG@1.0 | States.LogSuccess, States.LogFailure |
- Sequence=true isn't translated to a comment; it's translated to a functional constraint, MaxConcurrency: 1. If you flip it to false, the rule emits a Map with no concurrency cap and the lineage row changes from ITERATE_SEQ_TO_STEPFN_MAP to ITERATE_PAR_TO_STEPFN_MAP. Same node, different rule.
- The retry topology (Sleep → loop back to ChargeAccount) is preserved structurally. The rule writes RetryAfterDelay.Next = "ChargeAccount", not a comment saying "this should retry." Replay it: the 5xx path actually loops.
- AccumulateOutput=true would be easy to silently drop, and that's exactly the kind of detail an LLM-led migration misses. The rule preserves it as ResultPath: $.charges, so the Map's accumulated array survives. The lineage row catches it.
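The property-to-field decisions listed above can be sketched as two small functions. This is only an illustration of the mapping described in the lineage table: `iterate_to_map`, `sleep_to_wait`, and the `result_slot` parameter are hypothetical names, and the choice to reject sub-second sleeps is an assumption made so that nothing is silently truncated.

```python
def iterate_to_map(props, result_slot):
    """Translate Iterate group properties into (rule_id, Map state skeleton)."""
    sequential = props.get("Sequence") == "true"
    state = {"Type": "Map"}
    if sequential:
        # Sequence=true becomes a functional constraint, not a comment.
        state["MaxConcurrency"] = 1
    if props.get("AccumulateOutput") == "true":
        # Keep the per-item results under a named slot.
        state["ResultPath"] = f"$.{result_slot}"
    else:
        state["ResultPath"] = None  # serialises to JSON null, discarding results
    rule_id = "ITERATE_SEQ_TO_STEPFN_MAP" if sequential else "ITERATE_PAR_TO_STEPFN_MAP"
    return rule_id, state


def sleep_to_wait(duration_ms, next_state):
    """Translate a Sleep activity into a Wait state that loops to next_state."""
    if duration_ms % 1000:
        # Assumption: flag rather than round, since Wait takes whole seconds.
        raise ValueError("sub-second sleep needs human review in this sketch")
    return {"Type": "Wait", "Seconds": duration_ms // 1000, "Next": next_state}
```

With the BillingDispatch properties, `iterate_to_map({"Sequence": "true", "AccumulateOutput": "true"}, "charges")` yields the sequential rule ID and a skeleton with `MaxConcurrency` 1 and `ResultPath` `$.charges`; flipping `Sequence` changes the rule ID and drops the cap.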
Parallel group — three concurrent checks
Credit check + fraud check + KYC, fan-out then join. Tests that the engine preserves concurrency semantics.
OnboardCustomer.bwp (excerpt)
     1  <bw:group name="ParallelEnrich" type="bw.generalactivities.Parallel">
     2    <bw:branch>
     3      <bw:activity name="CreditCheck" type="bw.http.SendHTTPRequest">
     4        <bw:property name="Url">https://credit.svc/check</bw:property>
     5      </bw:activity>
     6    </bw:branch>
     7    <bw:branch>
     8      <bw:activity name="FraudCheck" type="bw.http.SendHTTPRequest">
     9        <bw:property name="Url">https://fraud.svc/score</bw:property>
    10      </bw:activity>
    11    </bw:branch>
    12    <bw:branch>
    13      <bw:activity name="KYCCheck" type="bw.jdbc.JDBCQuery">
    14        <bw:property name="SQL">SELECT status FROM kyc WHERE customer_id=?</bw:property>
    15      </bw:activity>
    16    </bw:branch>
    17  </bw:group>
onboard-customer.asl.json (excerpt)
    "ParallelEnrich": {
      "Type": "Parallel",
      // src: OnboardCustomer.bwp#L1  rule: PARALLEL_GROUP_TO_STEPFN_PARALLEL@1.4
      "Branches": [
        {
          "StartAt": "CreditCheck",
          "States": { "CreditCheck": {
            "Type": "Task", "Resource": "arn:aws:states:::http:invoke",
            "Arguments": { "ApiEndpoint": "https://credit.svc/check", "Method": "POST" },
            "End": true }}
        },
        { "StartAt": "FraudCheck", "States": { /* ... */ }},
        { "StartAt": "KYCCheck", "States": { /* ... */ }}
      ],
      "ResultSelector": {
        "credit.$": "$[0]", "fraud.$": "$[1]", "kyc.$": "$[2]"
      },
      "Next": "DecideEligibility"
    }
A bw.generalactivities.Parallel group joins implicitly when all branches finish, and a Step Functions Parallel state has the same semantics. ResultSelector projects each branch's output to a named slot; the rule preserves the BW branch index order, so the downstream DecideEligibility state sees credit, fraud, and kyc exactly as the original did.
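The index-order projection can be illustrated in a few lines. In this sketch the slot names are passed in explicitly, because how the engine derives "credit" from "CreditCheck" is not stated here; `result_selector` and `project` are hypothetical helper names.

```python
def result_selector(slot_names):
    """Build a ResultSelector that maps branch index i to the i-th slot name."""
    return {f"{name}.$": f"$[{index}]" for index, name in enumerate(slot_names)}


def project(branch_outputs, slot_names):
    """Apply the same index-order projection to a list of branch outputs."""
    return {name: branch_outputs[index] for index, name in enumerate(slot_names)}
```

Because both functions rely only on position, reordering the branches in the source without reordering the slot names would change which output lands in which slot, which is why preserving the branch index order matters.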
XSLT mapper → JSONata
A BW Mapper activity body, including xsl:for-each, xsl:choose, and an aggregate expression.
OrderToShipment.xsl
     1  <xsl:template match="/Order">
     2    <Shipment>
     3      <orderId><xsl:value-of select="@id"/></orderId>
     4      <customer>
     5        <name><xsl:value-of select="Customer/Name"/></name>
     6        <tier><xsl:value-of select="Customer/Tier"/></tier>
     7      </customer>
     8      <items>
     9        <xsl:for-each select="LineItems/Item">
    10          <item>
    11            <sku><xsl:value-of select="@sku"/></sku>
    12            <weight><xsl:value-of select="Weight * Quantity"/></weight>
    13          </item>
    14        </xsl:for-each>
    15      </items>
    16      <totalWeight>
    17        <xsl:value-of select="sum(LineItems/Item/Weight * LineItems/Item/Quantity)"/>
    18      </totalWeight>
    19      <priority>
    20        <xsl:choose>
    21          <xsl:when test="Customer/Tier = 'GOLD'">EXPRESS</xsl:when>
    22          <xsl:otherwise>STANDARD</xsl:otherwise>
    23        </xsl:choose>
    24      </priority>
    25    </Shipment>
    26  </xsl:template>
mappers/order-to-shipment.jsonata
    /* Generated from OrderToShipment.xsl · rule: XSLT_TO_JSONATA@2.1 */
    {
      "orderId": $.order.id,
      "customer": {
        "name": $.order.customer.name,
        "tier": $.order.customer.tier
      },
      "items": $.order.lineItems.{
        "sku": sku,
        "weight": weight * quantity
      },
      "totalWeight": $sum($.order.lineItems.(weight * quantity)),
      "priority": $.order.customer.tier = "GOLD" ? "EXPRESS" : "STANDARD"
    }
XPath / XSLT construct → JSONata rule table
| XSLT construct | JSONata equivalent | Rule |
|---|---|---|
| <xsl:value-of select="@id"/> (L3) | $.order.id | XPATH_ATTR_TO_JSONATA@1.4 |
| <xsl:value-of select="Customer/Name"/> (L5) | $.order.customer.name | XPATH_TO_JSONATA@1.4 |
| <xsl:for-each select="LineItems/Item"> (L9) | $.order.lineItems.{...} | FOREACH_TO_JSONATA_MAP@1.2 |
| Weight * Quantity (L12) | weight * quantity | XPATH_ARITH_TO_JSONATA@1.0 |
| sum(... Weight * Quantity) (L17) | $sum($.order.lineItems.(weight * quantity)) | XPATH_SUM_TO_JSONATA@1.0 |
| <xsl:choose> / <xsl:when> (L20-23) | tier = "GOLD" ? "EXPRESS" : "STANDARD" | XSLT_CHOOSE_TO_JSONATA_TERNARY@1.1 |
The rule library covers the constructs shown above: attribute and path selection, arithmetic, aggregates (sum/count/avg), choose, and for-each. What is not in the rule library (XSLT keys, doc(), document(), custom XSLT extensions, key-based joins) is flagged, not silently translated. The flag appears in the verification report as HUMAN_REVIEW_REQUIRED with the file and line, and the migration is not marked complete until a reviewer signs off. We don't ask the LLM to "be creative" with XSLT extensions; we ask a human.
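The "translate the subset, flag the rest" behaviour can be sketched for the simplest case, plain child and attribute paths. The converter below is a deliberately narrow illustration: `xpath_to_jsonata` and `lower_camel` are hypothetical names, the lower-camel-case naming is inferred from the example output, and collapsing a repeated element such as LineItems/Item into a single array is outside what this sketch handles.

```python
import re

# One path step: an element name or an @attribute, nothing else.
STEP = re.compile(r"^@?[A-Za-z_][A-Za-z0-9_]*$")


def lower_camel(name):
    """Lower-case the first character, e.g. 'Customer' -> 'customer'."""
    return name[0].lower() + name[1:]


def xpath_to_jsonata(xpath, root):
    """Translate a simple child/attribute path; return None to flag for review."""
    steps = xpath.split("/")
    if not all(STEP.match(step) for step in steps):
        # Functions, predicates, '//' and anything else fall outside the subset.
        return None
    names = [lower_camel(step.lstrip("@")) for step in steps]
    return "$." + ".".join([root] + names)
```

For example, `xpath_to_jsonata("Customer/Name", "order")` gives `$.order.customer.name`, while an expression that calls `document()` returns `None` and would be routed to human review rather than guessed at.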
Java code activity → Lambda
The case people assume needs an LLM. It doesn't — for the common case the rule preserves the Java verbatim. The LLM only enters when the body references a BW-internal class.
OrderIngest.bwp (Java activity body)
    <bw:activity name="CalculateChecksum" type="bw.generalactivities.JavaCode">
      <bw:input>
        <payload xsi:type="xsd:string"/>
      </bw:input>
      <bw:output>
        <checksum xsi:type="xsd:string"/>
      </bw:output>
      <bw:javacode>
        import java.security.MessageDigest;
        import java.nio.charset.StandardCharsets;

        public String execute(String payload) throws Exception {
          MessageDigest md = MessageDigest.getInstance("SHA-256");
          byte[] hash = md.digest(payload.getBytes(StandardCharsets.UTF_8));
          StringBuilder sb = new StringBuilder();
          for (byte b : hash) {
            sb.append(String.format("%02x", b));
          }
          return sb.toString();
        }
      </bw:javacode>
    </bw:activity>
lambda/calculate-checksum/Handler.java
    // Generated from CalculateChecksum activity · rule: JAVACODE_TO_LAMBDA@2.0
    // Body preserved verbatim · src-body-hash: d1e2f3a4...
    package com.cognidev.generated;

    import com.amazonaws.services.lambda.runtime.Context;
    import com.amazonaws.services.lambda.runtime.RequestHandler;
    import java.security.MessageDigest;
    import java.nio.charset.StandardCharsets;
    import java.util.Map;
    import java.util.HashMap;

    public class Handler implements RequestHandler<Map<String,String>, Map<String,String>> {

      // Contract gate: input key 'payload' matches bw:input/payload@xsd:string
      public Map<String,String> handleRequest(Map<String,String> in, Context ctx) {
        try {
          String checksum = execute(in.get("payload"));
          Map<String,String> out = new HashMap<>();
          out.put("checksum", checksum);  // matches bw:output/checksum@xsd:string
          return out;
        } catch (Exception e) { throw new RuntimeException(e); }
      }

      // === BW body, preserved verbatim ===
      public String execute(String payload) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] hash = md.digest(payload.getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        for (byte b : hash) {
          sb.append(String.format("%02x", b));
        }
        return sb.toString();
      }
    }
The for (byte b : hash) loop is preserved byte-for-byte. The hash of the source body is recorded in the evidence pack so an auditor can prove no rewrite happened. The exception, and the place the LLM does earn its keep, is when the BW Java references a vendor-internal class like com.tibco.pe.plugin.SubprocessInvoker. The rule library has no Lambda equivalent. The LLM gets one shot at proposing a replacement; the proposal must compile, match the input/output schema, and produce identical output on the replay set. If it fails any of those, the activity is flagged for human review. There's no "the LLM tries again with more context" loop on a failing contract.
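An auditor who wants an independent check on this activity could reproduce both ideas outside the JVM: the checksum the Java body computes, and a digest over the body text itself. The sketch below uses Python's standard `hashlib`; the function names are hypothetical, and hashing the body as raw UTF-8 text with no normalisation is an assumption about how a body hash might be defined.

```python
import hashlib


def calculate_checksum(payload: str) -> str:
    """Lower-case hex SHA-256 of the UTF-8 payload, like the Java execute()."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def body_hash(source_body: str) -> str:
    """Digest of an activity body exactly as given; any edit changes the value."""
    return hashlib.sha256(source_body.encode("utf-8")).hexdigest()


def body_unchanged(source_body: str, emitted_body: str) -> bool:
    """True only when the emitted body is character-for-character identical."""
    return body_hash(source_body) == body_hash(emitted_body)
```

For instance, `calculate_checksum("abc")` returns `ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad`, the standard SHA-256 test vector, which a replay harness could compare against the Lambda's output for the same payload.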
Schema equivalence — how we prove semantic match
Every migration emits a schema-diff.json that compares the source XSD field-by-field against the target JSON Schema. This is the artifact a customer's architect reads first.
    {
      "schema": "Order",
      "source": { "file": "/Schemas/Order.xsd", "hash": "7e2d…" },
      "target": { "file": "/schemas/order.schema.json", "hash": "a91f…" },
      "fields": [
        { "path": "Order/@id", "src": "xsd:string", "tgt": "string", "status": "MATCH" },
        { "path": "Order/Customer/Name", "src": "xsd:string", "tgt": "string", "status": "MATCH" },
        { "path": "Order/Customer/Tier", "src": "xsd:string", "tgt": "string", "status": "MATCH" },
        { "path": "Order/LineItems/Item[]/@sku", "src": "xsd:string", "tgt": "string", "status": "MATCH" },
        { "path": "Order/LineItems/Item[]/Weight", "src": "xsd:decimal", "tgt": "number", "status": "MATCH" },
        { "path": "Order/LineItems/Item[]/Quantity", "src": "xsd:int", "tgt": "integer", "status": "MATCH" },
        { "path": "Order/@createdAt", "src": "xsd:dateTime", "tgt": "string (date-time)", "status": "MATCH_WITH_FORMAT_HINT" }
      ],
      "summary": { "match": 6, "format_hint": 1, "mismatch": 0, "unmapped": 0 },
      "replay": { "samples": 12, "passed": 12, "failed": 0, "divergence_fields": [] }
    }
MATCH_WITH_FORMAT_HINT means the values are equivalent but expressed differently (XSD xsd:dateTime → JSON Schema string with format: date-time). The replay block proves it with sample data: 12 input messages from the source side were run through both the original BW process and the generated Step Functions state machine, with field-by-field diffs of the resulting outputs. Zero divergence means the migration is semantically equivalent for the sample set; the same harness keeps running in CI after deploy.
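The field-by-field comparison can be approximated with a lookup table from XSD types to JSON Schema types. The sketch below covers only the four type pairs that appear in the example; `TYPE_MAP`, `diff_fields`, and `summarize` are hypothetical names, and a real rule set would need far more entries.

```python
from collections import Counter

# XSD type -> (JSON Schema type, status). Only pairs seen in the example.
TYPE_MAP = {
    "xsd:string": ("string", "MATCH"),
    "xsd:decimal": ("number", "MATCH"),
    "xsd:int": ("integer", "MATCH"),
    "xsd:dateTime": ("string (date-time)", "MATCH_WITH_FORMAT_HINT"),
}


def diff_fields(source_fields):
    """Map each (path, xsd_type) pair to a schema-diff row."""
    rows = []
    for path, src_type in source_fields:
        tgt_type, status = TYPE_MAP.get(src_type, (None, "UNMAPPED"))
        rows.append({"path": path, "src": src_type, "tgt": tgt_type, "status": status})
    return rows


def summarize(rows):
    """Count rows per status so the summary always agrees with the field list."""
    counts = Counter(row["status"] for row in rows)
    return {
        "match": counts["MATCH"],
        "format_hint": counts["MATCH_WITH_FORMAT_HINT"],
        "unmapped": counts["UNMAPPED"],
    }
```

Deriving the summary from the rows, rather than maintaining it separately, means the totals cannot drift from the fields they describe.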
The evidence pack — what an auditor receives
Every migration ships a signed evidence bundle. An auditor opens it and answers their own questions; no presentation needed.
- source-tree.json: every source file with hash, size, and parse status
- dependency-graph.json: the full DAG with all transitions and shared resources
- rules-applied.csv: one row per node: source_file, source_line, node_type, rule_id, rule_version, target_file, target_line
- schema-diff.json: field-by-field XSD ↔ JSON Schema match
- replay-results.json: sample-message replay with field-level divergence (zero or otherwise)
- llm-proposals.jsonl: every LLM call: prompt, response, contract-gate verdict (accepted / rejected with reason). Empty if the project had no long-tail bodies.
- human-review.md: activities flagged for human review with file, line, reason
- rule-library-manifest.json: every rule used, with version, hash, and a link to its source in the published library
- signature.sig: signature of the bundle so downstream consumers can verify integrity
A reviewer can pick any row in rules-applied.csv, open the cited rule in the manifest, open the source and target files at the cited lines, and verify with their own eyes that the rule did what it claims. That's what we mean by audit-grade.
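The spot-check described here is easy to script. The sketch below reads a rules-applied.csv with the seven columns listed in the evidence pack and returns the rows for a given source location; the sample rows are transcribed from the OrderIngest lineage table, and `find_rows` is a hypothetical helper, not part of any shipped tooling.

```python
import csv
import io

COLUMNS = [
    "source_file", "source_line", "node_type",
    "rule_id", "rule_version", "target_file", "target_line",
]

# Two rows transcribed from the OrderIngest lineage table, for illustration.
SAMPLE = """source_file,source_line,node_type,rule_id,rule_version,target_file,target_line
OrderIngest.bwp,5,bw.jms.JMSQueueReceiver,JMS_RECEIVE_XA_TO_AMAZON_MQ,2.3,infra/state-machines/order-ingest.asl.json,6
OrderIngest.bwp,15,bw.jdbc.JDBCQuery,JDBC_PARAM_TO_RDS_DATA_API,3.2,infra/state-machines/order-ingest.asl.json,28
"""


def find_rows(csv_text, source_file, source_line):
    """Return every rules-applied row that cites the given source location."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return [
        row for row in reader
        if row["source_file"] == source_file and int(row["source_line"]) == source_line
    ]
```

From a returned row, the reviewer has the rule ID and version to look up in the manifest and the target file and line to open next.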
What we don't claim
A document going to a customer is worthless if it overclaims. Here is what this engine is not:
- Not 100% automated for every BW project. Custom XSLT extensions, vendor-internal Java classes, exotic adapter configs, and rarely-used activities are flagged for human review. The flag itself is part of the evidence pack — you see it before you ship.
- Not "drop-in zero-effort." Customers still wire up IAM, VPCs, secrets, observability, and any business-side decisions that the BW project didn't make explicit (e.g. DLQ retention, max-receive count). The engine emits sensible defaults; you tune them.
- Not a behavioural-equivalence proof for every input. Replay equivalence is over the sample set, which we make as wide as we can reach. Production-grade testing for safety-critical workloads still needs a domain-specific test plan.
- Not a guarantee that AWS will price exactly like TIBCO. Cost shape changes — Amazon MQ vs EMS, Lambda vs persistent JVM, Step Functions Express vs Standard — we document the trade-offs; you choose.
- The LLM is bounded, not absent. It writes names, comments, fixtures, narratives, and proposes long-tail replacements behind a contract gate. We claim "deterministic structure with bounded synthesis," not "no LLM in the building."
If any of these matter, ask. If your project is the kind where they don't, the engine ships a complete, reviewable migration with audit-grade evidence — and the LLM stays where it belongs.