July 21, 2012

Walking in the Air Chinese lyrics.lrc



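My son loves listening to Walking in the Air, but he can't yet sing such a long English song, so I translated it and worked it into a Chinese version.
The backing track is based on the Connie Talbot version: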


[00:02.22]空中漫步
[00:05.45]中文翻译:tallrain
[00:08.12]
[00:10.34]我们漫步在空中
[00:15.31]飘在月下的天空
[00:22.80]看见了人们
[00:25.32]睡在香甜的梦中
[00:29.94]
[00:32.71]紧紧地拥抱
[00:37.82]在蓝色的夜空驰骋
[00:45.18]一起向上飞
[00:47.66]飞向遥远的星辰
[00:54.71]
[01:10.25]穿过这世界
[01:15.29]村庄消失在梦中
[01:22.81]河流和山丘
[01:25.23]森林溪水一起走
[01:29.99]
[01:33.06]孩子们张着嘴
[01:37.91]惊讶地看着
[01:42.82]没有人能相信
[01:47.48]他的眼睛
[01:51.48]
[01:51.63]遨游在空中
[01:56.44]游过冰冻的天空
[02:03.99]巨大的冰山
[02:06.87]飘过我们的身边
[02:11.43]
[02:29.01]突然间跳进了
[02:34.20]深深的大海 
[02:39.02]沉睡的鲸鱼呀
[02:44.13]跳向了天空 
[02:48.75]
[02:50.28]漫步在空中
[02:55.26]在这夜空中跳舞
[03:02.82]看见了人们
[03:05.26]送上他们的祝福
[03:11.07]
[03:12.74]漫步在空中
[03:17.53]漫步在空中

July 20, 2012

What Can IT Systems Learn from the US Transportation System?



I travelled to the US for two weeks in May this year and was deeply impressed by the efficiency of the US Transportation System.
It's often said that the US is A Country On Wheels. In a sense, the Transportation System's efficiency affects Labor Productivity, and the Road Network also improves the balance between city and countryside.
Here, from a performance point of view, I want to list some lessons from the US Transportation System that might help us improve IT Systems.

1. Read/Write Separation
US roads mainly include Interstate Roads, State Roads, and Country Roads.
The Interstate Road is the main route for long-haul Trucks, and a wide Buffer Zone is set up between the 2 directions. A State Road might not have a Buffer Zone, but a Mark Line is painted to separate the directions. A Country Road needs neither a Buffer Zone nor a Mark Line because of its low traffic.
In IT terms, this design can be regarded as Read/Write Separation.
From a business requirement point of view, both Read and Write operations are part of the normal procedure; logically, neither is special. But if Read and Write operations hit the same Database Object, we must consider the impact of concurrency: a Write operation might lock the Table and make Read operations wait.
According to Object-Oriented Methodology, if both Read and Write operations target the same Object, then after mapping, all the Attributes involved should be defined in the same Table, which keeps the mapping logically simple.
For example, if we need to read Attribute 1 and write Attribute 2 of Object A, then after mapping this equals reading Column 1 of Table A and writing Column 2 of Table A. Since both Attributes are defined in the same Table A, writing Column 2 will lock the affected records of Table A, and reading Column 1 of those records will be held until the lock is released (after Commit/Rollback). So this design might hurt performance.
Let's take a Manufacturing System as an example.
A Work Order Object has Meta Information such as Scheduled Date, Product Number, and Quantity, and it also has Non-Meta Information such as Current Manufacturing Station and Current Stock Location.
Once Manufacturing begins, the Meta Information no longer changes. It is only read, but the read frequency might be high, because every station needs to read it.
The Current Manufacturing Station and Current Stock Location, on the other hand, change many times. Each update is a Write operation that locks records of the Work Order Table, so reads of the Meta data might be delayed.
So, for performance, we should put the Work Order's Meta Information into one Table and put Current Manufacturing Station, Current Stock Location, and other frequently changing Attributes into another Table or Tables. The Read and Write operations are then separated, and the Write operation no longer impacts the Read operation.
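
The split described above can be sketched with Python's built-in sqlite3 module. This is only an illustration of the table layout: the table and column names (work_order_meta, work_order_status, and so on) are assumptions made up for the example, and SQLite locks the whole database file rather than individual records, so the sketch shows the schema separation but not the locking benefit, which depends on the locking model of the database actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Read-mostly meta information: written once when the order is created.
    CREATE TABLE work_order_meta (
        order_id       INTEGER PRIMARY KEY,
        scheduled_date TEXT NOT NULL,
        product_number TEXT NOT NULL,
        quantity       INTEGER NOT NULL
    );
    -- Frequently updated status: one row per order, rewritten at every station.
    CREATE TABLE work_order_status (
        order_id        INTEGER PRIMARY KEY REFERENCES work_order_meta(order_id),
        current_station TEXT NOT NULL,
        stock_location  TEXT NOT NULL
    );
""")

# Hypothetical sample order, used only for illustration.
conn.execute("INSERT INTO work_order_meta VALUES (1, '2012-07-20', 'P-100', 50)")
conn.execute("INSERT INTO work_order_status VALUES (1, 'ST-01', 'LOC-A')")
conn.commit()


def move_order(order_id, station, location):
    """Write path: touches only the status table."""
    conn.execute(
        "UPDATE work_order_status "
        "SET current_station = ?, stock_location = ? WHERE order_id = ?",
        (station, location, order_id),
    )
    conn.commit()


def read_meta(order_id):
    """Read path: touches only the meta table."""
    return conn.execute(
        "SELECT scheduled_date, product_number, quantity "
        "FROM work_order_meta WHERE order_id = ?",
        (order_id,),
    ).fetchone()


def read_status(order_id):
    """Current station and stock location of one order."""
    return conn.execute(
        "SELECT current_station, stock_location "
        "FROM work_order_status WHERE order_id = ?",
        (order_id,),
    ).fetchone()


move_order(1, "ST-02", "LOC-B")
print(read_meta(1), read_status(1))
```

Every station calls read_meta, while only move_order writes, and the two never touch the same table.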

2. Redundancy
There are 3 examples that can be understood through the concept of Redundancy in IT Systems.
1) Buffer Zone of City Road
On a City Road, a Buffer Zone is set up between the 2 directions. It narrows right before an intersection, making room for an extra Lane where Left Turn cars wait.
The advantage of this design is that Left Turn cars don't block through traffic. After all, through traffic has the largest impact on total traffic, so its priority should be the highest.
2) Waiting Zone of small road
Cars driving on the main road do not slow down at an intersection when the light is green. They keep their normal speed, because the main road has the highest driving priority.
Cars driving on a small road might wait quite a long time before getting onto the main road. So the Waiting Zone should be designed with enough capacity to hold quite a lot of waiting cars.
3) STOP Sign
When driving through a community, cars must come to a full stop at every STOP Sign, even with no one in sight, and drivers must make sure the way is clear before continuing.
This seems like an obvious redundancy, because most of the time the chance of cars meeting is low. But the law orders drivers to stop. Why?
Because even though the accident rate is low, an accident is a typical Black Swan incident: it's very hard to forecast, and when it happens it becomes the Bottleneck.
So by following this rule, people spend more time on the road, but the accident rate drops, so the redundancy is acceptable.

For IT Systems, Redundancy design is also an effective method to improve performance.
Taking Database Query performance as an example, there are usually 3 cases that might affect performance.
1) Group By Query
By the design of Databases, a Group By Query is quickest when the Group By Column is of Number Type. If the Group By Column is of Char Type, the Query time might be hundreds or thousands of times longer.
So if the business requires a Group By Query, we should first add a redundant Column of Number Type and map each Number to the original Char value. We then Query the data with Group By on the Number Column and Join the resulting Dataset with the mapping Table to get the final result.
From the business point of view, this extra Number Column is redundant, but it's a must for improving performance.
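
A minimal sketch of this pattern, again using Python's sqlite3 module: the shipments and product_map tables and their sample rows are hypothetical, invented only for the example. It shows the shape of the query (aggregate on the Number Column first, then Join with the mapping Table). How much time it actually saves depends on the database engine and the data, and should be measured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Mapping table: redundant integer key for each original Char value.
    CREATE TABLE product_map (
        product_id   INTEGER PRIMARY KEY,
        product_code TEXT NOT NULL UNIQUE
    );
    -- Fact table stores only the integer key.
    CREATE TABLE shipments (
        shipment_id INTEGER PRIMARY KEY,
        product_id  INTEGER NOT NULL REFERENCES product_map(product_id),
        quantity    INTEGER NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO product_map VALUES (?, ?)",
    [(1, "WIDGET-A"), (2, "WIDGET-B")],
)
conn.executemany(
    "INSERT INTO shipments (product_id, quantity) VALUES (?, ?)",
    [(1, 10), (2, 5), (1, 7), (2, 1)],
)

# Step 1: Group By on the integer column only.
# Step 2: Join the small aggregated result with the mapping table
#         to recover the original Char value.
rows = conn.execute("""
    SELECT m.product_code, g.total
    FROM (SELECT product_id, SUM(quantity) AS total
          FROM shipments
          GROUP BY product_id) AS g
    JOIN product_map AS m ON m.product_id = g.product_id
    ORDER BY m.product_code
""").fetchall()
print(rows)
```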
2) Full-table scan
Full-table scan is a very common scenario and consumes a lot of Query time. The solution is very simple: add an Index.
From the business point of view, an Index is also redundant, but the effect is great.
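
The effect can be observed by asking the database for its query plan before and after the Index is created. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. The orders table, its data, and the index name idx_orders_product are assumptions for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, product_number TEXT, quantity INTEGER)"
)
# Hypothetical data: 1000 orders spread over 100 product numbers.
conn.executemany(
    "INSERT INTO orders (product_number, quantity) VALUES (?, ?)",
    [(f"P-{i % 100}", i) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE product_number = 'P-42'"


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


before = plan(QUERY)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_orders_product ON orders (product_number)")
after = plan(QUERY)   # the planner can now search through the index

print("before:", before)
print("after: ", after)
```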
3) Recursive Query
Recursive Query is a weak point of Databases. It consumes a lot of time even when the Database has Built-in Functions for it.
A common example is the BOM structure. A BOM usually uses multiple levels of Parent-Child relationships to build its structure, so getting the whole structure requires a lot of Recursive Queries, and it takes much longer as the quantity or the number of levels increases.
For this kind of Query, a common solution is to prepare the data before the business needs it: query the BOM in advance and save the result into a Flat Table or a Materialized View, so that the business logic can later Query from this Flat Table or Materialized View.
That is, we use redundant time before the business needs it, and redundant data outside the business logic, to prepare data in advance, so that we reduce Query time when the business event happens.
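
As a rough sketch of this idea, the code below explodes a small Parent-Child BOM once with a recursive query and stores the result in a Flat Table, so that later lookups are plain non-recursive Queries. It uses Python's sqlite3 module. The bom and bom_flat tables, the sample parts, and the rebuild_flat_bom function are hypothetical, and the sketch assumes the BOM contains no cycles. SQLite has no Materialized Views, so a regular table stands in for one here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Multi-level Parent-Child structure, one row per direct relationship.
    CREATE TABLE bom (
        parent TEXT NOT NULL, child TEXT NOT NULL, qty INTEGER NOT NULL
    );
    -- Redundant Flat Table: every component under every root, pre-expanded.
    CREATE TABLE bom_flat (
        root TEXT NOT NULL, component TEXT NOT NULL,
        depth INTEGER NOT NULL, total_qty INTEGER NOT NULL
    );
""")
conn.executemany("INSERT INTO bom VALUES (?, ?, ?)", [
    ("BIKE", "WHEEL", 2),
    ("BIKE", "FRAME", 1),
    ("WHEEL", "SPOKE", 32),
    ("WHEEL", "TIRE", 1),
])


def rebuild_flat_bom():
    """Run the expensive recursive explosion once, ahead of business time."""
    conn.execute("DELETE FROM bom_flat")
    conn.execute("""
        WITH RECURSIVE explode(root, component, depth, total_qty) AS (
            SELECT parent, child, 1, qty FROM bom
            UNION ALL
            SELECT e.root, b.child, e.depth + 1, e.total_qty * b.qty
            FROM explode AS e
            JOIN bom AS b ON b.parent = e.component
        )
        INSERT INTO bom_flat (root, component, depth, total_qty)
        SELECT root, component, depth, total_qty FROM explode
    """)
    conn.commit()


def components_of(root):
    """Business-time lookup: a simple Query on the Flat Table, no recursion."""
    return conn.execute(
        "SELECT component, total_qty FROM bom_flat "
        "WHERE root = ? ORDER BY component",
        (root,),
    ).fetchall()


rebuild_flat_bom()
print(components_of("BIKE"))
```

The Flat Table has to be rebuilt whenever the BOM changes, which is the price paid for the faster lookup.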

美国公路系统对IT系统的借鉴



本人于今年5月到美国出了两周的公差,在此期间美国公路系统的发达高效给我留下了深刻的印象。
我们常听说美国是车轮上的国家,从某种意义上来说,公路系统的效率直接影响了国民的劳动生产率,而公路网的健全也促进了城乡之间的均衡。
这里我试图从性能的角度出发,记录几点美国公路系统对IT系统的借鉴。

一、读写分离
美国公路主要分州际公路、州内公路、乡村公路3个级别,大致对应中国的高速公路、国道、省道。
其中州际公路是卡车物流的要道,公路的两个方向之间有很宽的隔离带。州内公路虽然不一定设置隔离带,但是至少通过划线的方式区分行进方向。而乡村公路由于车流量小,往往只有一个车道,因此就没必要区分和标注了。
用IT的语言解读,这个设计相当于读写分离。
从需求的角度出发,读和写的操作都是正常流程的一部分,从逻辑上来理解并没有什么特殊性。但是如果读和写针对同一个数据库对象,那么我们就必须考虑并发性的影响:写操作造成的锁表会导致读操作的等待。
按照面向对象的方法论,如果读和写针对的是同一个对象,那么对象经过映射之后,所有操作针对的属性都应该放在同一个表中,那么这种映射关系的逻辑是最简洁的。
比如对于对象A,要读属性1,要写属性2,那么经过映射,相当于读表A的字段1,写表A的字段2。由于这两个属性放在同一个表中,那么对字段2的写操作会造成表A中这一行数据的锁保护,一直要等到锁保护解除(COMMIT/ROLLBACK)才能进行此行数据的读操作。因此这种设计可能会对性能造成影响。
举一个生产系统的例子。
比如工单这个对象,有工单的元信息如计划生产日期、产品号、数量等,此外还有工单当前所在制造工位和库位等信息。
一旦进入制造环节,那么工单的元信息就不会有变动,对它的操作只有读操作,但是很频繁,因为每个工位都会读取。
而工单的当前工位和库位会不断地变动,因此会不断地写,而每次的写操作都会造成数据行的锁保护,因此影响元信息数据读取。
因此从性能的角度出发,我们应该将工单的元信息放在一个表里,而将当前工位和库位等不断更新的属性放在另一个或若干个表里,从而实现了读写分离,这样写的操作不会影响读的操作。

二、冗余
美国公路系统中有3个例子,可以参照IT系统中冗余的概念进行理解。
1、城市公路中央隔离带
城市公路的2个走向之间有很宽的隔离带,当道路有转向时,隔离带区域缩小,留出一个车位的面积供左转向车等待时停泊。
这样设计的好处是转向车不会影响直行车,毕竟直行车对总体流量的影响是最大的,因为直行车道的畅通是优先级最高的。
2、小路等待区域
行驶在大路上的车辆,如果是绿灯,即使要经过路口,车辆的速度保持在正常行驶速度,司机不会减速行驶,因为大家都认为主路的行驶优先级高。
那么行驶在小路上的车辆,在驶入大路之前,可能会经过较长时间的等待,以避免和主道车辆争抢。因此小路在接入大路之前,所设计的区域应允许较多车辆的临时停留。
3、STOP牌
车辆行驶在住宅或办公区时,在与人车可能交汇的路口设置STOP停车警示牌,司机必须在此牌前将汽车减速停车(即使路口无人),确认安全后再通过。
这个举措表面上看起来是个明显的冗余,因为很多路口人车交汇的概率并不高,但是法律规定司机必须在此牌前停车,其意义何在呢?
这是因为事故出现的概率尽管很低,但是作为黑天鹅事件,是很难预测的,并且一旦出现事故就会造成堵塞交通的瓶颈。
因此尽管大家都增加了道路停留时间,但是恶性事故出现的概率降低了,因此这个冗余是合理的。

对于IT系统来说,冗余设计通常也是提升性能的有效方法。
以数据库查询性能来说,通常影响数据库查询的有以下3种情况:
1、分组查询
按照数据库的设计原理,当进行分组查询的时候,查询的字段类型为整型时效率是最高的,如果用字符型字段进行分组,则查询消耗时间会达成百上千倍之多。
因此当业务需要进行分组查询的时候,首先我们要在数据库里增加冗余字段,将要分组的字符型数据映射成整型数据,将数据按照整型字段汇总,然后将数据集与映射表关联,最终得到我们需要的业务数据。
从业务的角度出发,这个增加的整型字段就是冗余数据,但是它对提升性能是不可或缺的。
2、全表扫描
全表扫描是一个经常碰到的场景,会消耗大量的查询时间,解决方法很简单,是建立索引。
对于业务来说,索引也是冗余数据,但是效果立竿见影。
3、递归查询
对于递归的处理是数据库的弱项,即使数据库有一些处理递归的内置函数,但是往往还是会产生大量的查询时间。
一个常见的例子是BOM的结构。通常BOM采用多级父-子件关系来建立完整的结构,因此在还原时会有大量的递归查询,随着查询数量和层级的增加,查询时间会变得相当长。
对于此类查询,一个常见的解决办法是提前做数据处理,在业务发生之前,将BOM的结构查询出来,并且以展开的平表格式存储到表或物化视图中,这样业务发生时直接查询平表或物化视图即可。
也就是说,利用业务发生前的冗余时间,利用与业务逻辑无关的冗余数据,提前处理从而减少业务发生时的查询时间。