Dealing with not so potent Idempotent values.

Idempotent values are hard to come by [in relational database terms these are what we call "primary keys"]; they uniquely identify some record, or value, or object. At least you are certain they do... until they don't. The great big world has a habit of not respecting the idem-potency of what you declare to be idempotent; something out there just recycles it anyway. This is no less true in IT systems like ERP applications than it is anywhere else.

An ERP package I work with defines a workorder document to have a workorder number that is 9 digits. And when you get to the end of that nine digit range... yea, you saw that coming... it starts over again at the bottom. Nine digits is a lot, but with dozens of people pounding away 40+ hours a week nine digits only lasts about five years. The trouble comes in that the data is exported from that ERP application and then sent to an OIE work-flow which performs [after some transformation] an sqlUpsertAction into a table in a relational database. That data is then used for a variety of other work dispatch and decision support applications.

Why an sqlUpsertAction rather than an sqlInsertAction? An obvious question. As any experienced data grinder knows: no process is ever quite as straight-forwards as it first appears it should be. This ERP applications handles credits [you know - those things you have to issue when someone screws up] by reissuing the same workorder number the next day as all negative values, and then reissuing the same workorder number the day after that with the new corrected values. So if you process the stream of the workorder's segments in date order you will end up with the correct final values [you really should use a CQRS model and journalize your data changes]. That is a bit odd, but not a problem, the primary key for the record in the relational database table is still workorder number + workorder segment number. All the other fields acquired from the input stream can just be updated (including the date) and you'll be left with a coherent representation of the current state of the workorder. Simple... Until that workorder number rolls over. Then your sqlUpsertAction either finds records that do no actually represent the same document the user is looking at, or [virtuously] it fails with a referential constraint violation and an exception report is sent.

So workorder number + workorder segment number is only sorta kinda an idempotent value [primary key]. And the data stream, ultimately, does not contain any idempotent value which identifies THIS DOCUMENT definitively. Or at least not a consistently idempotent value, it is still a valid primary key within a certain scope! That it has a scope - there in lies the solution to this type of problem.

Fortunately sqlUpsertAction has a WHERE parameter that allows 'manually' specified scope to be applied to the SELECT/INSERT/UPDATE command sequence that will be used to create the relational modification. The keys and values are still used in the same manner - except the extra qualificiation specified in the WHERE parameter is added onto the query. Previously my BPML sqlUpsertAction stanza looked like:

<action name="MXType2RecordCollector" id="000030" 
        extensionAttributes="MXType2RecordCollector/000030">
  <output>
    <source property="TransformedData"/>
  </output> 
  <attributes xmlns=""> 
    <extension name="activityName">sqlUpsertAction</extension> 
    <extension name="dataSource">miecr@BARNET</extension> 
    <extension name="description"/> 
  </attributes> 
</action>

With that stanza the SELECT statement looked like: "SELECT workorder, workorder_segment FROM tableName WHERE workorder = :0 AND workorder_segment = :1". If one and only one record was found an UPDATE like UPDATE tableName SET ... WHERE workorder = :0 AND workorder_segment = :1 would be performed, if zero records were found an appropriate INSERT would be performed, and if more than one records were found by the SELECT an error would be raised [indicating your use of they keys does not match the existing data - your keys are not idempotent]. This performed flawlessly for years. Then the ERP application quietly one day reset the workorder number sequence. BLAM!

The revised sqlUsertAction stanza using the WHERE parameter to apply scope to the SELECT and UPDATE is:

<action name="MXType2RecordCollector" id="000030"
        extensionAttributes="MXType2RecordCollector/000030">
  <output>
    <source property="TransformedData"/>
  </output>   
  <attributes xmlns=""> 
    <extension name="activityName">sqlUpsertAction</extension>
    <extension name="dataSource">miecr@BARNET</extension> 
    <extension name="where">event_date > TODAY - 730 UNITS DAY</extension>
    <extension name="description"/> 
  </attributes>     
</action>

Now the SELECT looks like: SELECT workorder, workorder_segment FROM tableName WHERE workorder = :0 AND workorder_segment = :1 AND event_date > TODAY - 730 UNITS DAY. And a corresponding UPDATE, if a record is found, looks like: UPDATE tableName SET ... WHERE workorder = :0 AND workorder_segment = :1 AND event_date > TODAY - 730 UNITS DAY".

So with two clearly documented provisos -

  1. The workorder sequence does not roll over in two years.
  2. The life-cycle of no workorder [issue, credit, re-issue] exceeds two years

this small change allows the data to continue to be streamed by the OIE workflow into the relational database table. The condition provided by the WHERE attribute keeps the scope of the already existing data considered for update limited by currency [time]. The WHERE parameter allows the elegance of the sqlUsertAction to be deployed even when dealing with only-sorta-kinda primary keys; frequently those are the only primary keys one has.