狭路相逢勇者胜: March 2008

Monday, March 31, 2008

TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms

1. Slow Start
2. Congestion Avoidance
3. Fast Retransmit
4. Fast Recovery

RFC 2001

http://www.faqs.org/rfcs/rfc2001.html

1.  Slow Start

 Old TCPs would start a connection with the sender injecting multiple
 segments into the network, up to the window size advertised by the
 receiver.  While this is OK when the two hosts are on the same LAN,
 if there are routers and slower links between the sender and the
 receiver, problems can arise.  Some intermediate router must queue
 the packets, and it's possible for that router to run out of space.
 [2] shows how this naive approach can reduce the throughput of a TCP
 connection drastically.

 The algorithm to avoid this is called slow start.  It operates by
 observing that the rate at which new packets should be injected into
 the network is the rate at which the acknowledgments are returned by
 the other end.

 Slow start adds another window to the sender's TCP:  the congestion
 window, called "cwnd".  When a new connection is established with a
 host on another network, the congestion window is initialized to one
 segment (i.e., the segment size announced by the other end, or the
 default, typically 536 or 512).  Each time an ACK is received, the
 congestion window is increased by one segment.  The sender can
 transmit up to the minimum of the congestion window and the
 advertised window.  The congestion window is flow control imposed by
 the sender, while the advertised window is flow control imposed by
 the receiver.  The former is based on the sender's assessment of
 perceived network congestion; the latter is related to the amount of
 available buffer space at the receiver for this connection.

 The sender starts by transmitting one segment and waiting for its
   ACK. When that ACK is received, the congestion window is incremented
   from one to two, and two segments can be sent.  When each of those
   two segments is acknowledged, the congestion window is increased to
   four.  This provides an exponential growth, although it is not
 exactly exponential because the receiver may delay its ACKs,
 typically sending one ACK for every two segments that it receives.

 At some point the capacity of the internet can be reached, and an
 intermediate router will start discarding packets.  This tells the
 sender that its congestion window has gotten too large.

 Early implementations performed slow start only if the other end was
 on a different network.  Current implementations always perform slow
 start.

2.  Congestion Avoidance

 Congestion can occur when data arrives on a big pipe (a fast LAN) and
 gets sent out a smaller pipe (a slower WAN).  Congestion can also
 occur when multiple input streams arrive at a router whose output
 capacity is less than the sum of the inputs.  Congestion avoidance is
 a way to deal with lost packets.  It is described in [2].

 The assumption of the algorithm is that packet loss caused by damage
 is very small (much less than 1%), therefore the loss of a packet
 signals congestion somewhere in the network between the source and
 destination.  There are two indications of packet loss:  a timeout
 occurring and the receipt of duplicate ACKs.

 Congestion avoidance and slow start are independent algorithms with
 different objectives.  But when congestion occurs TCP must slow down
 its transmission rate of packets into the network, and then invoke
 slow start to get things going again.  In practice they are
 implemented together.

 Congestion avoidance and slow start require that two variables be
 maintained for each connection: a congestion window, cwnd, and a slow
 start threshold size, ssthresh.  The combined algorithm operates as
 follows:

 1.  Initialization for a given connection sets cwnd to one segment
     and ssthresh to 65535 bytes.

 2.  The TCP output routine never sends more than the minimum of cwnd
     and the receiver's advertised window.

 3.  When congestion occurs (indicated by a timeout or the reception
     of duplicate ACKs), one-half of the current window size (the
     minimum of cwnd and the receiver's advertised window, but at
     least two segments) is saved in ssthresh.  Additionally, if the
     congestion is indicated by a timeout, cwnd is set to one segment
     (i.e., slow start).

 4.  When new data is acknowledged by the other end, increase cwnd,
     but the way it increases depends on whether TCP is performing
       slow start or congestion avoidance.

    If cwnd is less than or equal to ssthresh, TCP is in slow start;
    otherwise TCP is performing congestion avoidance.  Slow start
    continues until TCP is halfway to where it was when congestion
    occurred (since it recorded half of the window size that caused
    the problem in step 2), and then congestion avoidance takes over.

    Slow start has cwnd begin at one segment, and be incremented by
      one segment every time an ACK is received.  As mentioned earlier,
    this opens the window exponentially:  send one segment, then two,
    then four, and so on.  Congestion avoidance dictates that cwnd be
      incremented by segsize*segsize/cwnd each time an ACK is received,
    where segsize is the segment size and cwnd is maintained in bytes.
    This is a linear growth of cwnd, compared to slow start's
    exponential growth.  The increase in cwnd should be at most one
    segment each round-trip time (regardless how many ACKs are
    received in that RTT), whereas slow start increments cwnd by the
    number of ACKs received in a round-trip time.

 Many implementations incorrectly add a small fraction of the segment
 size (typically the segment size divided by 8) during congestion
 avoidance.  This is wrong and should not be emulated in future
 releases.

3.  Fast Retransmit (after 3 duplicate acks before the timer expires)

 Modifications to the congestion avoidance algorithm were proposed in
 1990 [3].  Before describing the change, realize that TCP may
 generate an immediate acknowledgment (a duplicate ACK) when an out-
 of-order segment is received (Section 4.2.2.21 of [1], with a note
 that one reason for doing so was for the experimental fast-
 retransmit algorithm).  This duplicate ACK should not be delayed.
 The purpose of this duplicate ACK is to let the other end know that a
 segment was received out of order, and to tell it what sequence
 number is expected.

 Since TCP does not know whether a duplicate ACK is caused by a lost
   segment or just a reordering of segments, it waits for a small number
   of duplicate ACKs to be received.  It is assumed that if there is
 just a reordering of the segments, there will be only one or two
 duplicate ACKs before the reordered segment is processed, which will
 then generate a new ACK.  If three or more duplicate ACKs are
  received in a row, it is a strong indication that a segment has been
  lost.  TCP then performs a retransmission of what appears to be the
  missing segment, without waiting for a retransmission timer to
  expire.

4.  Fast Recovery (after fast retransmit, congestion avoidance instead of slow start)

 After fast retransmit sends what appears to be the missing segment,
   congestion avoidance, but not slow start is performed.  This is the
 fast recovery algorithm.  It is an improvement that allows high
   throughput under moderate congestion, especially for large windows.

 The reason for not performing slow start in this case is that the
 receipt of the duplicate ACKs tells TCP more than just a packet has
 been lost.  Since the receiver can only generate the duplicate ACK
 when another segment is received, that segment has left the network
 and is in the receiver's buffer.  That is, there is still data
 flowing between the two ends, and TCP does not want to reduce the
 flow abruptly by going into slow start.

 The fast retransmit and fast recovery algorithms are usually
 implemented together as follows.

 1.  When the third duplicate ACK in a row is received, set ssthresh
       to one-half the current congestion window, cwnd, but no less
       than two segments.  Retransmit the missing segment. Set cwnd to
       ssthresh plus 3 times the segment size.  This inflates the
     congestion window by the number of segments that have left the
     network and which the other end has cached (3).

 2.  Each time another duplicate ACK arrives, increment cwnd by the
       segment size.  This inflates the congestion window for the
       additional segment that has left the network.  Transmit a
       packet, if allowed by the new value of cwnd.

 3.  When the next ACK arrives that acknowledges new data, set cwnd
     to ssthresh (the value set in step 1).  This ACK should be the
     acknowledgment of the retransmission from step 1, one round-trip
     time after the retransmission.  Additionally, this ACK should
     acknowledge all the intermediate segments sent between the lost
     packet and the receipt of the first duplicate ACK.  This step is
     congestion avoidance, since TCP is down to one-half the rate it
     was at when the packet was lost.

 The fast retransmit algorithm first appeared in the 4.3BSD Tahoe
 release, and it was followed by slow start.  The fast recovery
 algorithm appeared in the 4.3BSD Reno release.

Friday, March 28, 2008

cross linking problem with g++ and gcc

The names of C++ functions are "mangled" to include information about
the types of the the function's arguments.  To use C functions from C++
source or vice versa, they must be declared extern "C".  You can group
these with extern "C" { ... }

If you look at the standard header files, you will see things like

#ifdef __cplusplus
extern "C" {
#endif

[declarations of C functions]

#ifdef __cplusplus
}
#endif


or, more cleanly, define in some common header file

#ifdef __cplusplus
#define BEGIN_C_DECLS extern "C" {
#define END_C_DECLS }
#else
#define BEGIN_C_DECLS
#define END_C_DECLS
#endif

and then

BEGIN_C_DECLS

[declarations of C functions]

END_C_DECLS


You should follow a similar strategy in the header files
for your project.  Make sure that every global function
is declared in one of those header files, and that the
appropriate header files are included. Every function which
is defined or used in C code must be declared extern "C".
Try not to do so for functions that are only used in C++ code,
since that will lose type-safe checks, but it should otherwise be
harmless.  As long as every global function is declared in a header
file and each header file is always included in every file that uses
any functions from it, the link should succeed.  (These are basic
principles of C/C++ engineering, regardless of whether you are
merging the two).

Monday, March 24, 2008

Minimum Spanning Tree (MST)

Main ideas: keep adding edges into the MST.

1. Kruskal's Algorithm
2. Prim's Algorithm

A graph G = (V, E)

Kruskal's Algorithm (Union-Find Data structure)

1. Give each node a different color (we can simply use numbers as color/name).

2. Order all the weighted edges in increasing order.

3. Scan the edge list, if the edge does not incur loop, which means the end points have the same color, then add that edge into MST. And Make union of the nodes once we add one edge into the MST.

4. The process will stop if all the nodes have the same color.

Prim's Algorithm:

1. we maintain a set S on which a spanning tree has been constructed so far. Initially S= {s}.

2. In each iteration, we grow S by one node, adding the node v that minimizes the "attachment cost", and including the edge e = (u, v) that achieves this minimum in the spanning tree.

Tuesday, March 18, 2008

vim tab setting

vi/vim的Tab預設為8個空格，若要改變Tab的空格數，有兩種方法。

第一種：
進入vi/vim後，在Last Line Mode中輸入
set tapstop=4
Tab空格數就會變成4，然而只套用於目前設定。

第二種：
將set tapstop=4寫進~/.vimrc裡，則每次開啟vi/vim就會套用這個值。

WinSCP

http://winscp.net/eng/index.php

An open source SFTP client and FTP client for Windows. Its main function is the secure file transfer between a local and a remote computer. Beyond this, WinSCP offers basic file manager functionality. It uses Secure Shell (SSH) and supports, in addition to Secure FTP, also legacy SCP protocol.

Saturday, March 15, 2008

Windows File System

File Systems

A file system enables applications to store and retrieve files on storage devices. Files are placed in a hierarchical structure. The file system specifies naming conventions for files and the format for specifying the path to a file in the tree structure.

Each file system consists of one or more drivers and dynamic-link libraries that define the data formats and features of the file system. File systems can exist on many different types of storage devices, including hard disks, jukeboxes, removable optical disks, tape back-up units, and memory cards.

All file systems supported by Windows have the following storage components:

Volumes. A volume is a collection of directories and files.
Directories. A directory is a hierarchical collection of directories and files.
Files. A file is a logical grouping of related data.

Directory Management

A directory is a hierarchical collection of directories and files. The only constraint on the number of files that can be contained in a single directory is the physical size of the disk on which the directory is located.

http://msdn2.microsoft.com/en-us/library/aa364407(VS.85).aspx

Tuesday, March 11, 2008

Introduction to STL

a C++ library of container classes, algorithms, and iterators

STL is a generic library, meaning that its components are heavily parameterized: almost every component in the STL is a template.

STL includes container classes: classes whose purpose is to contain other objects. The STL includes the classes vector, list, deque, set, multiset, map, multimap, hash_set, hash_multiset, hash_map, and hash_multimap. Each of these classes is a template, and can be instantiated to contain any type of object.

STL also includes collection of algorithms that manipulate the data stored in containers. algorithms are decoupled from the STL container classes.

iterators are a generalization of pointers. Iterators are the mechanism that makes it possible to decouple algorithms from containers: algorithms are templates, and are parameterized by the type of iterator, so they are not restricted to a single type of container.

Concepts and Modeling
One very important question to ask about any template function is what the set of types is that may correctly be substituted for the formal template parameters. These requirements are sufficiently important that we give them a name: we call such a set of type requirements a concept, and we call this particular concept Input Iterator. We say that a type conforms to a concept, or that it is a model of a concept, if it satisfies all of those requirements. We say that int* is a model of Input Iterator because int* provides all of the operations that are specified by the Input Iterator requirements.

Refinement
Refinement of concepts is very much like inheritance of C++ classes; the main reason we use a different word, instead of just calling it "inheritance", is to emphasize that refinement applies to concepts rather than to actual types. The main purpose is to impose additional requirements.

Other parts of STL

the STL includes several utilities: very basic concepts and functions that are used in many different parts of the library. The concept Assignable, for example, describes types that have assignment operators and copy constructors; almost all STL classes are models of Assignable, and almost all STL algorithms require their arguments to be models of Assignable.

Some low-level mechanisms for allocating and deallocating memory: Allocator

the STL includes a large collection of function objects, also known as functors. Just as iterators are a generalization of pointers, function objects are a generalization of functions: a function object is anything that you can call using the ordinary function call syntax. Function objects are an important part of generic programming because they allow abstraction not only over the types of objects, but also over the operations that are being performed.

Reference:

http://www.sgi.com/tech/stl/stl_introduction.html

Monday, March 10, 2008

两只“800镑猩猩”的战争

谷歌如何避开微软？全世界最善于剿灭竞争对手的公司并展开反击？两只“800镑猩猩”的战争 www.6park.com

　　文《环球企业家》记者骆轶航 www.6park.com

　　也许，微软和雅虎真的需要更长的时间权衡这场交易背后的战略意义，以及它究竟应该值450亿美元、500亿美元，还是更多。

　　也许，这家仍统治着操作系统市场、历三十年而弥新的软件霸主，真的可以进行一次“昂贵，但完美的收购”，与互联网上唯一可与谷歌相提并论的品牌水乳交融，并像鲍尔默给雅虎股东的信中所说的那样，实现“规模效应”、“增强研发能力”、“提高运营效率”、创造“新兴用户体验”。但真正有意义的问题只有一个：未来的几个月乃一年，属于谷歌的时间会就此停止吗？ www.6park.com

　　至少对于微软和雅虎而言，时间已经静止太久了。人们记忆所及，微软在2.4亿美元投资Facebook之前的一次互联网领域的胜利，还是10年前击败网景的导航者浏览器。而雅虎，上一次被视为网络业霸主，大概是2000年互联网泡沫破碎前夕。正是在这莫名缺失的十年中，谷歌从斯坦佛大学的两名肄业生的宠物程序，变成了一家最高时拥有2000亿美元市值的“下一个微软”。 www.6park.com

　　而时至今日，最让微软忌惮的事情发生了，看上去，谷歌已不动声色地全方位地动摇了微软生存的根基，Android从侧面出其不意地瓦解了微软移动操作系统的合理性；它同时凭借“云计算”的攻势，让一切与计算相关的事情都绕过收费的软件，而通过互联网进行。如果无意将商业史简化，比如将微软错过搜索引擎崛起大潮简单视为行业规则改变，而这家老牌公司开始松懈，谷歌崛起故事的一个重要价值不在于其传奇色彩，而在于，它是在何种方法论指引下，壮大自身同时避过了微软对全行业的虎视眈眈。 www.6park.com

　　众所周知，自 1975年成立以来，微软对于软件及周边行业的侵略性从未减少过。正如关于甲骨文公司创始人拉里埃里森的传记Everyone Else Must Fail里说：这家位于西雅图雷德蒙的巨兽最独特的本领是“压制小公司的创新能力”。通过有效的学习与强有力的执行，微软将数以百计的同行击溃。而微软这种高进攻性曾让太多竞争对手几乎丧失了与其竞争的勇气。比如曾在升阳和Novell两家公司与微软有过多年竞争经验的施密特（Eric Schmidt），在担任谷歌CEO之前曾对媒体表示，“只是谈及微软已经对我的健康不利”。 www.6park.com

　　谷歌是如何超越所有前辈的命运的？至少从必要条件上，可以被列举的部分包括：有意识地改变游戏规则（它从不涉足微软最擅长的领域，即“销售软件”），持续夯实自身的技术及管理基础（在谷歌之前有哪家公司的人力资源管理如此受外界关注？），从不主动将自己称为挑战者（至今依然如此）。这让 2004年微软逐渐将谷歌视为最大竞争对手后，依然无法压制对方。 www.6park.com

　　而从微软的角度，不可否认的是，一些微妙的变化正在发生。奇虎董事长周鸿对《环球企业家》表示，他关注微软多年来，盖茨主导公司对竞争对手发起攻击前都是极为低调的，就像1995年时撰写《互联网大浪潮》（Intenet tidal wave）等内部通信时，即使微软已经统治了桌面操作系统，但盖茨仍让公司像创业者一样从零开始思考问题。但近年来，鲍尔默治下的微软从没有真正放低姿态，他会公开嘲笑谷歌是“只会一招的小马驹”，说“Facebook才有活力，谷歌已经没有了”（请登录gemag.com.cn参看《独角戏》），即当微软始终以一种批判性眼光看待对手，它究竟有多大的动力完成一场自我革命呢？ www.6park.com

　　“谷歌射中了我们的靶心” www.6park.com

　　2003年12月，比尔·盖茨在浏览谷歌招聘页面时，发现上面并没有太多关于搜索引擎类的职位，相反地，它正在招聘大量操作系统设计、编译器优化、分布式系统结构设计等工程师。众所周知，这些通常是微软需要的核心人才。盖茨不免心生疑虑：难道谷歌要对外发布一套操作系统？ www.6park.com

　　“我们真的有必要注意这帮家伙了”，在给全体员工的邮件中，盖茨写道：“他们似乎在做一些与我们竞争的事”。 www.6park.com

　　这是典型的盖茨式担忧。此前的28年里，他用了太多时间和全世界争夺操作系统市场。苹果、IBM、Novell甚至网景都多多少少给他带来了一些麻烦，但这些强劲的挑战也让反复证明着，只要电脑存在，操作系统始终是这些空洞盒子的灵魂。 www.6park.com

　　看上去，谷歌很像“又一个挑战者”。这家成立不过5年的公司，正在成为一颗超新星。在当时，遍布全球的网络用户都已习惯用谷歌搜索自己的种种疑问，而且，虽然没有明确披露财务数字，谷歌的收入正在高增长之中，已经不是硅谷的秘密。 www.6park.com

　　其实，就在2003年秋天，微软曾动过收购谷歌的念头，这是微软一贯常用的手段。不幸的是，微软很快发现，它需要对全世界解释：收购一个完全建立在 Linux操作系统上的搜索引擎，它如何与Windows实现兼容？显然，彼时微软并不能轻易解决这个问题。2003年9月，对资本市场冷淡的谷歌一反常态地默认了上市计划，使它迅速成为人们预期中2004年资本市场的一颗明星。 www.6park.com

　　一个合乎逻辑的推论：一家已经取得了不错成就、且展现出蓬勃生命力的公司，怎么可能不进军这个行业的中心地带呢？如果谷歌开发一款免费、简单、开放的操作系统，并把大量软件和网络应用捆绑在上面，它对微软的杀伤力是可想而知的。 www.6park.com

　　微软迅速拉响了红色警报。2004年1月，在世界经济论坛上，盖茨首度公开表达他对谷歌崛起的忧虑：“谷歌射中了我们的靶心”。盖茨虽未明言“靶心”即是操作系统，但忧虑是前所未有的。 www.6park.com

　　相应的，谷歌的暧昧态度耐人寻味。它从不承认自己要开发一款开放给公众使用的操作系统，但它常用的说法，比如“我们只考虑用户的需求”，也太像欲盖弥彰。 www.6park.com

　　问题是，谷歌要做什么？它对外的说法是，公司的目标是“让全世界的网络用户随时随地获取所有所需信息”，谷歌似乎有明确的战略，但一向以阳光面孔示人的谷歌对此讳莫如深。一直到2006年，创始人之一布林还在表示：“我们尝试很多事情，因为无法确认什么是下一阶段最受欢迎的产品，所以我们尝试很多帮用户解决问题的方式”。而另一个创始人佩奇更为实在：“我们不谈战略，我宁可让人们认为我们晕头转向了，也不愿意知道我的竞争对手们知道我们的方向”。 www.6park.com

　　显然，这是从历史中习得的经验。当你被比尔?盖茨视为竞争对手，无论你是否会染指操作系统，你都危险了。微软并非一家小心翼翼守护自己地盘的公司，它的扩张性才是真正可怕的：任何一个被充分看清的巨大市场都可能成为微软的下一战。 www.6park.com

　　因此，这家在外界看来“野蛮生长”的谷歌，内部进行着极其周密的战略部署。 www.6park.com

　　首先，谷歌要使微软仍认为操作系统是“皇冠上的珠宝”。小心隐藏着战略的同时，谷歌也用各种渠道释放了一些烟雾：它内部开发了名为Goobuntu的操作系统，或者，谷歌将和沃尔玛推出一款使用谷歌操作系统的超低价电脑…… www.6park.com

　　这一“声东击西”之计的核心是：谷歌找到了自己的战场。搜索引擎和网络广告两块业务让它看到了微软没看到的东西，这包括中小企业的营销需求隐藏着一个数百亿美元的巨大市场，更关键的是，谷歌已经多少看到了信息网络化的可能性。换言之，未来一切的计算能力、信息应用和服务，都将通过互联网免费地提供给用户。对于微软来说，这才是真正颠覆性的：互联网颠覆桌面，免费的网络应用颠覆收费的软件。 www.6park.com

　　同时，谷歌极为富有远见地积累着各种长期竞争力。在绝大多数公司将搜索引擎视为算法问题时，Google已经意识到这是一个硬件问题，即存储整个互联网的成本和能力，决定着搜索引擎公司的长期竞争力。因此，它大量招募硬件人才，并成为了全球每年服务器产量最大的公司。它还开发了大量的应用程序，比如谷歌文件系统（Google File System），实现廉价、庞大、高容错，高性能的存储。还有可以让一个程序并发的跑在数万台电脑上的程序框架Mapreduce，可以让任何年轻工程师立即操控数万台电脑一晚上处理到庞大的数据计算，以及巨大的存储结构Bigtable。

　拖延与反击 www.6park.com

　　2005年3月初，在微软工作了16年的微软.Net My Services的首席工程师马克?拉克斯基（Mark Lucovsky）悄然从微软离职，并加盟谷歌。要知道，拉克斯基是Windows核心设计师之一，它曾主要负责Windows NT执行、内核、Win32 runtime以及其它关键部分的设计。拉克斯基的“叛逃”让鲍尔默在办公室内将椅子狠狠地摔在墙上。 www.6park.com

　　彼时，谷歌是否会开发操作系统已经是个次要问题，首要问题已经变成了谷歌触及了微软一直引以为傲的人才池。盖茨曾经有个观点，微软最大的竞争对手是投资银行高盛，因为它才真正分流了本应属于自己的人才。2005年时，似乎谷歌也成为了一个显而易见的人才争夺机器了，它天价且不停增长的股票，它那闻名全球的福利体系，以及它时刻流露出的酷劲都太像另一个微软了。 www.6park.com

　　微软开始酝酿重大调整。2005年11月1日，微软在旧金山启动了“Live”战略。比尔?盖茨声称，现在是“Live”软件的纪元，在促成新一代计算机的诞生方面，微软将扮演重要角色。微软制定计划开发两类网络服务，针对消费者的Windows Live和针对中小企业的Office Live，两者分别是微软Windows和Office软件的网络版。善联想者不免追溯到10年前，1995年12月7日，盖茨启动“互联网战略日”宣告网络浪潮来临，推出Internet Explorer 浏览器与Netscape竞争并最终击败对手。10年以来，这应该是微软在互联网领域最重要的战略了。 www.6park.com

　　负责深化“Live”战略的，是2005年刚加入微软的雷?奥兹（Ray Ozzie），2006年6月，他接替了盖茨首席架构师的职位。一个月后，在微软与分析师的沟通会上，奥兹发表了“PC时代即将终结”的惊世言论：“上一个时代，是个人计算机时代，微软自然会从PC的思考角度出发。但现在我们进入了新时代，一个以互联网为中心的时代”。显然，奥兹已经切中了谷歌未来战略的要害：推动信息产业和科技成长的，不再是任何一种固态的装置，更是互联网。 www.6park.com

　　微软改头换面了？似乎如此。人们不会忘了10年前微软那场“互联网革命”，正是由于Windows既得利益的阻碍，最终导致盖茨对互联网领域的 “入侵”仅以Internet Explorer的胜利宣告结束，并裹足不前。这次，盖茨宣称微软进入了“Live”时代，似乎有了比MSN更完整的互联网战略，但是，致命的事实是， “Windows Live”的名称本身就显示了：微软的互联网战略仍然是捆绑Windows和互联网。正如微软CEO鲍尔默接受《环球企业家》专访时不断重复的,微软的互联网战略，“我们仍然要强调Windows，Windows，Windows”。 www.6park.com

　　而在2006年底接受本刊专访时，微软首席研究师克瑞格?蒙迪也特别强调说：“微软和Google、雅虎这些公司的差别是，遗传上以及既有资产上，Windows、Windows CE和服务器产品都是平台提供者。” www.6park.com

　　即使是坚信PC时代即将终结的奥兹，也不得不顾及盖茨和鲍尔默们对Windows的本能依赖。尽管用户已经可以在Live.com网站上创建网页、搜索信息、定制新闻和在线交流，微软也尽量通过用户免费并依靠广告收入，取代向用户收费的商业模式。但当被问及微软为什么没有提供在线文字处理等应用软件时，奥兹犹疑地回答微软“在看有没有这个可能性，这将分流传统Office产品的销量”。但奥兹认为，微软还是会冒一些风险。尽管，比尔?盖茨和微软最担心的是：这个面向软件开发商而创建的以网络为基础的新操作系统会威胁到Windows操作系统。 www.6park.com

　　把微软完全“互联网化”的执行力，仍是困扰这个庞大软件帝国的难题：组建新的广告和搜索平台后，组织架构的调整从未间断，2006年9月微软内部重新调整组织架构后，广告和搜索被置于放在了平台和服务集团（PSG）里面，其中的“平台”即是指Windows桌面和服务器操作系统平台，这就意味着，微软的互联网战略和战术，仍然都严重依赖Windows的资源。 www.6park.com

　　而Windows Live相关业务，在平台和服务集团内部又分别由在线服务和Windows商业集团，广告和出版解决方案集团，以及搜索、门户和广告集团三个子集团负责，子集团间人员不断更迭，从名称上亦可看出，彼此的界限反复定义不清，角色和功能自然不能清晰。更混乱的是，Windows Live的研发又归属于另外一个子集团。这意味着事实上，Windows live想借助Windows的既有优势，竟需要跨部门调用资源，甚至将搜索引擎和广告结合起来，也需要跨部门的疏通。 www.6park.com

　　幸好，谷歌的进攻显得并不那么凶猛。2007年之前，除了搜索引擎，它真正颠覆性的作品只有两款：2004年愚人节推出的2G大的电子邮箱 Gmail，和2005年的地图系统Google Earth。此后，它只是推出了无数小产品。其中唯一值得微软重视的是类似Office产品线的作品：在线日历Google Calendar、工具和网页版试算表Google Docs+ Google Spreadsheets，终于进入微软最为敏感和畏葸不前的地带。不过，一如以往，谷歌仍始终不承认这是对微软的进攻。 www.6park.com

　　对决时刻 www.6park.com

　　这一次，谷歌也没有撒谎，即使在2006年底将自己的全部在线办公和应用软件打包为“Google Apps”，但这仍算不上真正的进攻，它们远非谷歌的利器。 www.6park.com

　　而微软却无法安之若素。即使全面拥抱Live也念念不忘Windows的微软始终认为“互联网为中心”不过意味着把操作系统搬到网络上，谷歌 Apps的推出使他们仍然在恐惧并等待那个属于谷歌的操作系统。另一面，“Windows Live”并没有带给微软互联网业务更好的时光：2005年2月MSN搜索业务尚且占据整个搜索领域市场份额的14%，两年后，重新打造的Windows Live Search的市场份额却只有9.6%。 www.6park.com

　　但到了2007年初，微软发现谷歌的操作系统越来越具象了，不是坊间流传的那些拼图，而是一个真实存在的东西。但它似乎在另一个战场上：移动设备。 www.6park.com

　　继苹果宣布推出iPhone之后，谷歌破天荒地对外表示，它们同样希望开发类似的东西。起初，它被一些谷歌的拥趸称做“Gphone”。然而施密特称，谷歌和Apple正在“联手做越来越多的事情。我们拥有相同的目标，相同的竞争对手”，于是，它被普遍理解为一款针对微软的移动操作系统。更多迹象能证明这一点，甚至追溯到两年前：2005年5月，谷歌收购为移动设备提供社会化网络软件的Dodgeball，8月收购移动设备软件提供商 Android.，还有一家小公司：提供便携式图形引擎的Skia。 www.6park.com

　　微软会本能地护紧Windows Mobile，属于微软的移动操作系统。2007年2月刚刚发布的Windows Mobile6.0已经开始支持HTML格式邮件的接收和移动版Windows Live Messenger的文件传输，看上去，它越来越看重互联网的作用了。另一方面，WindowsLive数百亿美元的投入换来的是颗粒无收，使微软不得不将更多的赌注置于在线广告之上。在2007年3月谷歌以31亿美元收购网络广告公司DoubleClick之后，5月微软以60亿美元收购网络广告公司 aQuantive。 www.6park.com

　　2007年11月，传说中的谷歌“操作系统”浮出水面。但它并非一款通常意义上的操作系统，谷歌推出“Android”平台，并联合30多家电信运营商、设备商、终端制造商和芯片商成立了“开放手机联盟”，它是一个真正意义上的开放性移动设备综合平台，包括操作系统、用户界面和应用程序。换而言之，它拥有移动电话工作所需的全部软件，任何参与者都能在开放的环境中开发与共享应用并从中获益。用谷歌的话说，它能创造更多“不可知的服务”。 www.6park.com

　　这意味着：开发移动操作系统的成本降至最低，它成为移动通信生态链中最廉价的环节，甚至免费，更甚至手机已经不需要一个像Windows Mobile那样的操作系统。而谷歌，它仍可以靠收取Android平台的广告费来获得收入。显然，数年之间微软对此毫无准备。你似乎可预见 Windows Mobile的未来了，某种程度上，它是谷歌第一次正式的对决，出其不意，具有毁灭性。 www.6park.com

　　谷歌同时发起了第二次真正的攻击，武器是“云计算”。 www.6park.com

　　2007年12月，谷歌CEO施密特揭晓了曾经迷惑微软和很多用户的答案：谷歌Apps与操作系统无关，它不过是谷歌通过互联网向用户提供更大计算能力的一个步骤。对于大多数人而言，计算是复杂而不可靠的。如果谷歌能够通过Web提供计算服务，将是一次真正的体验改进。其实，这种被称作“云计算”的技术是谷歌一切计算和服务的基础。你终于知道谷歌为什么要在数年之前即准备如此强大的后台能力了，谷歌数据库即是“云存储”，谷歌的搜索引擎就是云计算初期的服务产品……这种计算模式下，计算业务将不再局限于个人桌面和企业计算中心，而可以成为一种依托于互联网处理的服务。 www.6park.com

　　对微软来说，这是真正的宣战书，它解构了一切需要安装的软件，尤其是Windows。如果真的像谷歌的预言那样，一切计算、应用和服务都将发生在遥远的数据中心中的服务器，你可以通过台式机、笔记本电脑、手机和PDA访问这些服务的话，Windows还会在哪里呢？

Sunday, March 9, 2008

Associative Container - MAP

http://www.cprogramming.com/tutorial/stl/stlmap.html

STL Maps -- Associative Arrays
Suppose that you're working with some data that has values associated with strings -- for instance, you might have student usernames and you want to assign them grades. How would you go about storing this in C++? One option would be to write your own hash table. This will require writing a hash function and handling collisions, and lots of testing to make sure you got it right. On the other hand, the standard template library (STL) includes a templated class to handle just this sort of situation: the STL map class, which conceptually you can think of as an "associative array" -- key names are associated with particular values (e.g., you might use a student name as a key, and the student's grade as the data).

In fact, the STL's map class allows you to store data by any type of key instead of simply by a numerical key, the way you must access an array or vector. So instead of having to compute a hash function and then access an array, you can just let the map class do it for you.

To use the map class, you will need to include and maps are part of the std namespace. Maps require two, and possibly three, types for the template:

std::map

Notice that the comparison function is in brackets, indicating that it is optional so long as your key_type has the less-than operator, <, defined -- so you don't need to specify a comparision function for so-called primitive types such as int, char, or even for the string class. Moreover, if you overloaded the < char=""> grade_list;

Now, to actually store or access data in the map, all you need to do is treat it like a normal array and use brackets. The only difference is that you no longer use integers for the index -- you use whatever data type was used for the key.

For instance, following the previous example, if we wanted to assign a grade of 'B' to a student named "John", we could do the following:

grade_list["John"] = 'B';

If we later decided that John's grades had improved, we could change his grade in the same fashion:

grade_list["John"] = 'B';
// John's grade improves
grade_list["John"] = 'A';

So adding keys to a map can be done without doing anything special -- we just need to use the key and it will be added automatically along with the corresponding data item. On the other hand, getting rid of an element requires calling the function erase, which is a member of the map class:

erase(key_type key_value);

For instance, to erase "John" from our map, we could do the following:

grade_list.erase("John");

If we're curious, we could also check to see how many values the map contains by using the size function, which is also a member of the map class and other containers:

int size();

For instance, to find the size of our hypothetical class, we could call the size function on the grade list:

std::cout<<"The class is size "<< grade_list;
grade_list["John"] = 'A';
if(grade_list.find("Tim") == grade_list.end())
{
std::cout<<"Tim is not in the map!"<::iterator iterator_name;

where parameters are the parameters used for the container class this iterator will be used for. This iterator can be used to iterate in sequence over the keys in the container. Ah, you ask, but how do I know the key stored in the container? Or, even better, what exactly does it mean to dereference an iterator over the map container? The answer is that when you dereference an iterator over the map class, you get a "pair", which essentially has two members, first and second. First corresponds to the key, second to the value. Note that because an iterator is treated like a pointer, to access the member variables, you need to use the arrow operator, ->, to "dereference" the iterator.

For instance, the following sample shows the use of an iterator (pointing to the beginning of a map) to access the key and value.

std::map grade_list;
grade_list["John"] = 'A';
// Should be John
std::cout<first<<second<

Finally, you might wonder what the cost of using a map is. One issue to keep in mind is that insertion of a new key (and associated value) in a map, or lookup of the data associated with a key in a map, can take up to O(log(n)) time, where n is the size of the current map. This is potentially a bit slower than some hash tables with a good hashing function, and is due to the fact that the map keys are stored in sorted order for use by iterators.

Summary

The Good

* Maps provide a way of using "associative arrays" that allow you to store data indexed by keys of any type you desire, instead of just integers as with arrays.
* Maps can be accessed using iterators that are essentially pointers to templated objects of base type pair, which has two members, first and second. First corresponds to the key, second to the value associated with the key.
* Maps are fast, guaranteeing O(log(n)) insertion and lookup time.

The Gotchas

* Checking for whether an element is in a map requires using the find function and comparing the returned iterator with the iterator associated with the end of the map.
* Since maps associate a key with only one value, you need to use a multimap (or make the type of the data a container) if you must associate multiple pieces of data with one key.

call stack

gdb debugger

1. compile with flag -g

prompt > gcc -g program.c -o programname

2. start debugger

prompt> gdb programname

3. run

(gdb) run arg1 "arg2" ...

4. kill to stop execution and run to restart
(gdb) kill
(gdb) run

5. quit
(gdb) quit

Execution
(gdb) list
(gdb) next
(gdb) step
(gdb) print variable_name
(gdb) set variable_name = value
(gdb) finish (return from a function)

Call Stack
(gdb) backtrace
(gdb) frame #
(gdb) info frame
(gdb) info locals
(gdb) info args

Breakpoints
(gdb) break line
(gdb) break functionname
(gdb) break classname::func(int)
(gdb) info breakpoints
(gdb) disable 2(the number of the breakpoint showed with info breakpoints)
(gdb) ignore break_number times

http://www.unknownroad.com/rtfm/gdbtut/gdbtoc.html

Saturday, March 8, 2008

hash_map

using hash_map in g++:

#include <.ext/hash_map.>

using namespace __gnu_cxx;

Friday, March 7, 2008

Symbol Tables

Chaining of symbol tables results in a tree structure.

Wednesday, March 5, 2008

Standard Template Library

Race conditions

Tuesday, March 4, 2008

C FAQ

Array Initialization
external and static array will be initialized to zero by default. automatic array should be initialized manually. The method to initialized automatic array e.g. int a[100] = {0} (other unset elements will be set to zero).

strlen Vs sizeof
1) when applied to a pointer
strelen will renturn the number of the characters in the null-terminated string while sizeof will return the pointer size in bytes, which will probably be 4 bytes.
2) when applied to an array name
both return the size of the array.

NULL character
all bits are set to 0 and thus have a numeric value of 0, presented in the ASCII character set. null character can also be represented as '\0' in the source code.
Question: Can you input a null character in a disk file?

EOF
symbolic value (macro) EOF for end-of-file. no more data can be read from a data source. The data source is usually called a file or stream. The actual value of EOF is a system-dependent negative number, commonly -1, which is guaranteed to be unequal to any valid character code.
EOF is one of the control character or non-print character (which means itself does not represent a written symbol). in Unix, ctrl + d, in Windows, ctrl+z

getline Vs fgets Vs scanf

char * fgets (char *s, int count, FILE *stream)
The fgets function reads characters from the stream stream up to and including a newline character and stores them in the string s, adding a null character to mark the end of the string.

http://www.gnu.org/software/libc/manual/html_mono/libc.html

X Window System

X originated at MIT in 1984. The current protocol version, X11, appeared in September 1987.

X Window System (commonly X11 or X) is not an integral part of the operating system; instead, it is built as an additional application layer on top of the operating system kernel.

Unlike previous display protocols, X was specifically designed to be used over network connections rather than on an integral or attached display device. X features network transparency: the machine where an application program (the client application) runs can differ from the user's local machine (the display server).

This client-server terminology — the user's terminal as the "server", the remote or local applications as the "clients" — often confuses new X users, because the terms appear reversed. But X takes the perspective of the program, rather than that of the end-user or of the hardware: the local X display provides display services to programs, so it acts as a server; any remote program uses these services, thus it acts as a client.

What is the X Window System?

[ScheiflerGettys92]

The X Window System, or X, is a network-transparent window system. With X, multiple applications can run simultaneously in windows, generating text and graphics in monochrome or color on a bitmap display. Network transparency means that application programs can run on machines scattered through the network.

What is an X Server?

An X Server is a program that provides display and user input services to other programs. In comparison, a file server provides other programs with access to file storage devices. File servers are typically located in a remote location and you use the services of a file server from the machine that you are located at. In contrast, an X Server is typically running on the machine that you are located at; display and user input services may be requested by programs running on your machine, as well as by programs running on remote machines.

What is an X client?

An X client is a program that utilizes the display and user input services provided by an X Server. X clients may run on the same or disparate machine as the X Server that is providing display and user input services.

ssh -X qiang@129.49.32.81

You will need to have the X-server running on your local computer and then use ssh -X option to connect to the remote server.

Reference:

http://www.linux-tip.net/cms/content/view/302/26/

http://en.wikipedia.org/wiki/X_Window_System

http://x.cygwin.com/docs/faq/cygwin-x-faq.html#q-what-is-cygwin-x

狭路相逢勇者胜