July 20, 2012

What Can IT Systems Learn from the US Transportation System?



I travelled to the US for two weeks in May this year and was deeply impressed by how efficient the US transportation system is.
The US is often called a country on wheels. The efficiency of its transportation system to some extent affects labor productivity, and the road network also improves the balance between city and countryside.
Here, from a performance point of view, I want to list some lessons from the US transportation system that might help us improve IT systems.

1. Read/Write Separation
US roads mainly include interstate roads, state roads, and country roads.
An interstate road is the main route for long-haul trucks, and a wide buffer zone is set up between the two directions. A state road may have no buffer zone, but a marking line is painted to separate the directions. A country road needs neither a buffer zone nor a marking line because its traffic is low.
In IT terms, this design can be regarded as read/write separation.
From a business point of view, read and write operations are both part of the normal procedure, and neither is logically special. But if reads and writes hit the same database object, we must consider the impact of concurrency: a write operation might lock the table (or some of its rows, depending on the database's locking model) and make read operations wait.
According to object-oriented methodology, if read and write operations target the same object, then after mapping both operations are defined against the same table. This mapping is logically simple.
For example, if we need to read attribute 1 and write attribute 2 of object A, then after mapping this means reading column 1 of table A and writing column 2 of table A. Since both attributes are defined in table A, writing column 2 locks the affected records of table A, and reading column 1 is held until the lock is released (after commit or rollback). So this design might hurt performance.
Let's take a manufacturing system as an example.
A work order object has meta information such as scheduled date, product number, and quantity, and it also has non-meta information such as current manufacturing station and current stock location.
Once manufacturing begins, the meta information no longer changes. It is only read, but the read frequency might be high because every station needs to read it.
The current manufacturing station and current stock location, by contrast, change many times. Each update is a write operation that locks records of the work order table, so reads of the meta data might be delayed.
So, for performance, we should define the work order's meta information in one table, and put current manufacturing station, current stock location, and other frequently changing attributes in another table or tables. As a result, read and write operations are separated, and the write operations no longer affect the read operations.
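The split described above can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module. The table names (work_order_meta, work_order_status), the column names, and the helper functions are assumptions made up for this sketch, not taken from any real manufacturing system. Note that an in-memory SQLite database does not reproduce the record locking of a server database, so the sketch shows only the schema separation and the two access paths, not the locking behavior itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Read-mostly meta information: written once when the order is created.
    CREATE TABLE work_order_meta (
        work_order_id  INTEGER PRIMARY KEY,
        scheduled_date TEXT    NOT NULL,
        product_number TEXT    NOT NULL,
        quantity       INTEGER NOT NULL
    );
    -- Frequently changing attributes live in their own table.
    CREATE TABLE work_order_status (
        work_order_id          INTEGER PRIMARY KEY
                               REFERENCES work_order_meta(work_order_id),
        current_station        TEXT NOT NULL,
        current_stock_location TEXT NOT NULL
    );
""")


def create_work_order(order_id, scheduled_date, product_number, quantity):
    """Insert the meta row and an initial status row in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO work_order_meta VALUES (?, ?, ?, ?)",
            (order_id, scheduled_date, product_number, quantity),
        )
        conn.execute(
            "INSERT INTO work_order_status VALUES (?, ?, ?)",
            (order_id, "NOT_STARTED", "NONE"),
        )


def move_work_order(order_id, station, stock_location):
    """Write path: updates only work_order_status."""
    with conn:
        conn.execute(
            "UPDATE work_order_status "
            "SET current_station = ?, current_stock_location = ? "
            "WHERE work_order_id = ?",
            (station, stock_location, order_id),
        )


def read_meta(order_id):
    """Read path: queries only work_order_meta."""
    return conn.execute(
        "SELECT scheduled_date, product_number, quantity "
        "FROM work_order_meta WHERE work_order_id = ?",
        (order_id,),
    ).fetchone()


def read_status(order_id):
    """Return (current_station, current_stock_location) for one order."""
    return conn.execute(
        "SELECT current_station, current_stock_location "
        "FROM work_order_status WHERE work_order_id = ?",
        (order_id,),
    ).fetchone()


create_work_order(1, "2012-07-20", "P-100", 50)
move_work_order(1, "ASSEMBLY", "WH-A")
print(read_meta(1))    # ('2012-07-20', 'P-100', 50)
print(read_status(1))  # ('ASSEMBLY', 'WH-A')
```

Because the stations only call read_meta, and the frequent updates only touch work_order_status, the two kinds of traffic no longer meet in the same table.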

2. Redundancy
Here are three examples that can be understood through the concept of redundancy in IT systems.
1) Buffer Zone of City Road
On a city road, a buffer zone is set up between the two directions. It narrows right before an intersection, making room for an extra lane where left-turning cars wait.
The advantage of this design is that left-turning cars do not block through traffic. Through traffic has the greatest impact on total traffic, so it should have the highest priority.
2) Waiting Zone of Small Road
Cars on the main road do not slow down before an intersection when the light is green. They keep their normal speed, because the main road has the highest driving priority.
Cars on a small road, however, might wait quite a long time before entering the main road. So the waiting zone should be designed with enough capacity to hold quite a lot of waiting cars.
3) STOP Sign
When driving in a residential community, cars must come to a full stop at a STOP sign, even with no one in sight, and drivers must make sure the way is clear before continuing.
This seems like obvious redundancy, since most of the time the chance of meeting another car is low. So why does the law order drivers to stop?
Because even though the incident rate is low, an incident is a typical Black Swan event: it is very hard to forecast, and it becomes the bottleneck when it happens.
By following this rule, people spend more time on the road, but the incident rate drops, so the cost is acceptable.

For IT systems, redundant design is an effective way to improve performance.
Take database query performance as an example. There are usually three cases that might affect performance.
1) Group By Query
In database design, a Group By query is generally fastest when the Group By column is a number type. The query time might be hundreds or thousands of times longer if the Group By column is a Char type.
So if the business requires a Group By query, we should first add a redundant column of number type and map each number to the original Char value. We then query the data grouped by the number column and join that result set with the mapping table to get the final result.
From a business point of view, this extra number column is redundant, but it is a must for performance.
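The "group by the number column, then join with the mapping table" pattern could look like the sketch below. It uses Python's built-in sqlite3 module. The tables (product_type, sales), their columns, and the sample rows are hypothetical and exist only for illustration. The sketch shows the query shape only and does not measure any speed difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Mapping table: redundant number key -> original Char value.
    CREATE TABLE product_type (
        type_id   INTEGER PRIMARY KEY,
        type_name TEXT NOT NULL UNIQUE
    );
    -- Fact table stores only the number key.
    CREATE TABLE sales (
        sale_id INTEGER PRIMARY KEY,
        type_id INTEGER NOT NULL REFERENCES product_type(type_id),
        amount  INTEGER NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO product_type VALUES (?, ?)",
    [(1, "Laptop"), (2, "Phone")],
)
conn.executemany(
    "INSERT INTO sales (type_id, amount) VALUES (?, ?)",
    [(1, 100), (2, 30), (1, 50), (2, 20)],
)

# Step 1: group by the number column in a subquery.
# Step 2: join the small grouped result with the mapping table.
rows = conn.execute("""
    SELECT t.type_name, g.total
    FROM (
        SELECT type_id, SUM(amount) AS total
        FROM sales
        GROUP BY type_id
    ) AS g
    JOIN product_type AS t ON t.type_id = g.type_id
    ORDER BY t.type_name
""").fetchall()

print(rows)  # [('Laptop', 150), ('Phone', 50)]
```

The join runs against the grouped result, which has one row per type, so the Char values are only looked up after the expensive grouping step is done.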
2) Full-table Scan
A full-table scan is a very common scenario and consumes a lot of query time. The solution is very simple: add an index.
From a business point of view, an index is also redundant, but its effect is great.
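One way to see the effect of adding an index is to compare the query plan before and after. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's built-in sqlite3 module. The orders table, its columns, the index name idx_orders_customer, and the query_plan helper are all assumptions for this example. The exact wording of the plan text differs between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"customer-{i % 100}", i) for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return the human-readable detail column of each plan row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


QUERY = "SELECT amount FROM orders WHERE customer = ?"

# Without an index, the filter on customer has to scan the whole table.
plan_before = query_plan(QUERY, ("customer-7",))

# The index is redundant data from a business view, kept only for speed.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_after = query_plan(QUERY, ("customer-7",))

print(plan_before)  # expected to describe a SCAN of orders
print(plan_after)   # expected to describe a SEARCH using idx_orders_customer
```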
3) Recursive Query
Recursive queries are a weak point of databases. They consume a lot of time even when the database has built-in functions for them.
A common example is a BOM (bill of materials) structure. A BOM usually uses multiple levels of parent-child relationships to build its structure, so retrieving the whole structure requires many recursive queries, and it takes much longer as the quantity or the number of levels increases.
For this kind of query, a common solution is to prepare the data before the business needs it: query the BOM in advance and save the result into a flat table or a materialized view, so the business logic can later query that flat table or materialized view.
In other words, we spend redundant time before the business needs it and store redundant data outside the business logic, preparing data in advance so that we reduce query time when the business request actually happens.
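As a rough sketch of this idea, the code below explodes a small BOM once, ahead of time, into a flat table, so that later lookups are plain indexed queries with no recursion. It uses Python's built-in sqlite3 module and a recursive common table expression. The table names (bom, bom_flat), the columns, the sample parts, and the helpers refresh_bom_flat and components_of are all hypothetical. SQLite has no materialized views, so an ordinary table stands in for one here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Parent-child BOM: one row per direct component.
    CREATE TABLE bom (
        parent TEXT    NOT NULL,
        child  TEXT    NOT NULL,
        qty    INTEGER NOT NULL
    );
    INSERT INTO bom VALUES
        ('bike',  'wheel', 2),
        ('bike',  'frame', 1),
        ('wheel', 'spoke', 32),
        ('wheel', 'rim',   1);

    -- Flat table: every component at every depth, precomputed.
    CREATE TABLE bom_flat (
        root      TEXT    NOT NULL,
        component TEXT    NOT NULL,
        depth     INTEGER NOT NULL,
        total_qty INTEGER NOT NULL
    );
    CREATE INDEX idx_bom_flat_root ON bom_flat (root);
""")


def refresh_bom_flat():
    """Rebuild bom_flat; meant to run ahead of time, e.g. after BOM changes."""
    with conn:
        conn.execute("DELETE FROM bom_flat")
        conn.execute("""
            WITH RECURSIVE explode(root, component, depth, total_qty) AS (
                SELECT parent, child, 1, qty FROM bom
                UNION ALL
                SELECT e.root, b.child, e.depth + 1, e.total_qty * b.qty
                FROM explode AS e
                JOIN bom AS b ON b.parent = e.component
            )
            INSERT INTO bom_flat (root, component, depth, total_qty)
            SELECT root, component, depth, total_qty FROM explode
        """)


def components_of(root):
    """Business-time query: a flat, indexed lookup with no recursion."""
    return conn.execute(
        "SELECT component, depth, total_qty FROM bom_flat "
        "WHERE root = ? ORDER BY depth, component",
        (root,),
    ).fetchall()


refresh_bom_flat()
print(components_of("bike"))
# [('frame', 1, 1), ('wheel', 1, 2), ('rim', 2, 2), ('spoke', 2, 64)]
```

The trade-off is the one described above: the recursive work and the extra rows are paid for in advance, and the flat table has to be refreshed whenever the BOM changes.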
