Tuesday, August 31, 2010

Installing & Running Scribe with HDFS support on CentOS

I have been working with Scribe for our logging for a while and finally feel like writing something about it. The main trigger came from a request to support HDFS. Since I always like tackling open source projects' incompatibilities across different environments, I made this challenge my highest priority (although my boss probably doesn't think so...)

Installing Scribe on CentOS 5.3 64-bit

1. Java SE Development Kit (JDK) 6 latest update - http://www.oracle.com/technetwork/java/javase/downloads/index.html. I used update 20. The java directory is: /usr/java/jdk1.6.0_20.

2. ruby-1.8.5-5.el5_4.8 + ruby-devel-1.8.5-5.el5_4.8 (using yum)

3. python-2.4.3-24.el5 + python-devel-2.4.3-24.el5.x86_64 (using yum)

4. libevent + libevent-devel - libevent-1.4.13-1/libevent-devel-1.4.13-1 (using yum)

5. gcc-c++-4.1.2-46.el5_4.2

6. boost 1.40 - http://downloads.sourceforge.net/project/boost/boost/1.40.0/boost_1_40_0.tar.gz?use_mirror=softlayer

[user@localhost] ./bootstrap.sh
[user@localhost] ./bjam
[user@localhost] sudo su -
[root@localhost] ./bjam install


7. flex-2.5.4a-41.fc6 (using yum)

8. m4-1.4.15 - ftp.gnu.org/gnu/m4 (do not use the version from yum)

9. imake-1.0.2-3.x86_64 (using yum)

10. autoconf-2.65 - ftp.gnu.org/gnu/autoconf (do not use the version from yum)

11. automake-1.11.1 - ftp.gnu.org/gnu/automake (do not use the version from yum)

12. libtool-2.2.6b - ftp.gnu.org/gnu/libtool (do not use the version from yum)

13. bison-2.3-2.1 (using yum). The build actually needs yacc, so create a yacc wrapper script that calls bison:

[root@localhost] more /usr/bin/yacc

#!/bin/sh
exec bison -y "$@"

[root@localhost] chmod +x /usr/bin/yacc


14. Thrift - latest version: http://incubator.apache.org/thrift/download
thrift-0.2.0 - http://archive.apache.org/dist/incubator/thrift/0.2.0-incubating
thrift-0.4.0 - http://archive.apache.org/dist/incubator/thrift/0.4.0-incubating

I am using thrift-0.2.0 in this example.

[user@localhost] ./bootstrap.sh
[user@localhost] ./configure

If you see this:
error: ./configure: line 21183: syntax error near unexpected token `MONO,'

Copy pkg.m4 from /usr/share/aclocal to thrift's aclocal directory. From the top-level thrift directory, do the following:

cp /usr/share/aclocal/pkg.m4 aclocal


Then again:
[user@localhost] ./bootstrap.sh
[user@localhost] ./configure
[user@localhost] make
[user@localhost] sudo su -
[root@localhost] make install
[root@localhost] exit 

 
You may see the following error when building thrift 0.4.0 or 0.5.0.

make[4]: Entering directory `/home/user/pkgs/thrift-0.4.0/lib/cpp'
/bin/sh ../../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I. -I../.. -I/usr/local/include -I./src -Wall -g -O2 -MT ThreadManager.lo -MD -MP -MF .deps/ThreadManager.Tpo -c -o ThreadManager.lo `test -f 'src/concurrency/ThreadManager.cpp' || echo './'`src/concurrency/ThreadManager.cpp
libtool: compile: g++ -DHAVE_CONFIG_H -I. -I../.. -I/usr/local/include -I./src -Wall -g -O2 -MT ThreadManager.lo -MD -MP -MF .deps/ThreadManager.Tpo -c src/concurrency/ThreadManager.cpp -fPIC -DPIC -o .libs/ThreadManager.o
In file included from src/concurrency/ThreadManager.cpp:20:
src/concurrency/ThreadManager.h:24:26: tr1/functional: No such file or directory

Please change line 24 of ThreadManager.h from

#include <tr1/functional>

to
#include <boost/tr1/tr1/functional>

We also need to compile and install the Facebook fb303 library. From the top-level thrift directory:
[user@localhost] cd contrib/fb303
[user@localhost] ./bootstrap.sh
[user@localhost] ./configure
[user@localhost] make
[user@localhost] sudo su -
[root@localhost] make install
[root@localhost] exit

15. hadoop 0.21.0 - http://www.apache.org/dyn/closer.cgi/hadoop/core/

[user@localhost] cd hadoop-0.21.0/hdfs/src/c++/libhdfs
[user@localhost] ./configure JVM_ARCH=tune=k8 --with-java=/usr/java/jdk1.6.0_20
[user@localhost] make
[user@localhost] sudo su -
[root@localhost] cp .libs/libhdfs.so .libs/libhdfs.so.0 /usr/local/include
[root@localhost] cp hdfs.h /usr/local/include
[root@localhost] exit
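
Once HDFS is up and running (step 17 below), you can optionally sanity-check the libhdfs build with a tiny C program against hdfs.h. This is only a minimal sketch using the standard libhdfs calls (hdfsConnect, hdfsOpenFile, hdfsWrite, hdfsFlush, hdfsCloseFile, hdfsDisconnect); the test file path and the compile/link flags are my own assumptions, so adjust them for your JDK and Hadoop layout, and make sure the Hadoop jars are on CLASSPATH (see step 18a) before running it.

/* hdfs_smoke.c - minimal libhdfs sanity check (sketch; adjust paths for your setup) */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include "hdfs.h"

int main(void)
{
    /* Connect to the namenode configured in core-site.xml (hdfs://localhost:9000) */
    hdfsFS fs = hdfsConnect("localhost", 9000);
    if (!fs) {
        fprintf(stderr, "hdfsConnect failed\n");
        return 1;
    }

    /* Hypothetical test path; any writable HDFS path will do */
    const char *path = "/scribetest/libhdfs_smoke.txt";
    hdfsFile out = hdfsOpenFile(fs, path, O_WRONLY | O_CREAT, 0, 0, 0);
    if (!out) {
        fprintf(stderr, "hdfsOpenFile failed for %s\n", path);
        hdfsDisconnect(fs);
        return 1;
    }

    const char *msg = "hello from libhdfs\n";
    hdfsWrite(fs, out, msg, (tSize)strlen(msg));  /* write and flush a short message */
    hdfsFlush(fs, out);
    hdfsCloseFile(fs, out);
    hdfsDisconnect(fs);
    return 0;
}

A compile line along the lines of gcc hdfs_smoke.c -I/usr/local/include -lhdfs -ljvm, with -L flags pointing at libhdfs.so and the JDK's libjvm.so (for a 64-bit JDK 6, typically $JAVA_HOME/jre/lib/amd64/server), should work; treat those paths as assumptions for your environment.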


16. scribe-2.2 - http://github.com/downloads/facebook/scribe/scribe-2.2.tar.gz - you have to use scribe-2.1 or above to support HDFS.

[user@localhost] ./bootstrap.sh --enable-hdfs
[user@localhost] ./configure
[user@localhost] make


17. Configure and Run Hadoop (single-node cluster in this tutorial).

a. First we need to modify a few configuration files. From the top-level hadoop directory, edit the conf/hadoop-env.sh, conf/core-site.xml, conf/hdfs-site.xml and conf/mapred-site.xml files.

[user@localhost] more conf/hadoop-env.sh

export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use. Required.
export JAVA_HOME=/usr/java/jdk1.6.0_20
.
.
.

[user@localhost] more conf/core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop</value>
<description>A base for other temporary directories.</description>
</property>

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>

[user@localhost] more conf/hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>

[user@localhost] more conf/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>


b. Format the namenode. From the top-level hadoop directory:
[user@localhost] bin/hadoop namenode -format


c. Start hadoop
[user@localhost] start-all.sh


d. Use jps to check if all the processes are started.
[user@localhost] jps
25362 JobTracker
24939 NameNode
25099 DataNode
25506 TaskTracker
25251 SecondaryNameNode
25553 Jps

e. Use netstat to check if port 9000 (set in core-site.xml) is listening.
[user@localhost] sudo netstat -nap | grep 9000
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 24939/java
tcp 0 0 127.0.0.1:9000 127.0.0.1:59957 ESTABLISHED 24939/java
tcp 0 0 127.0.0.1:9000 127.0.0.1:59960 ESTABLISHED 24939/java
tcp 0 0 127.0.0.1:59957 127.0.0.1:9000 ESTABLISHED 25099/java
tcp 0 0 127.0.0.1:59960 127.0.0.1:9000 ESTABLISHED 25251/java

f. Open a browser and go to http://server_ip:50070 to check that the Cluster Summary shows 1 Live Node. Be patient; sometimes it takes 30 seconds to a minute to show up. Remember to put the cursor in the address bar and press Enter to refresh the page. For some reason, F5 (Reload) doesn't work for me.



18. Configure and Run Scribe
a. Set up the Java classpath for Scribe. I have installed Hadoop 0.21.0 in the ~/pkgs/hadoop-0.21.0 directory.

[user@localhost] export CLASSPATH=~/pkgs/hadoop-0.21.0/hadoop-hdfs-0.21.0.jar:~/pkgs/hadoop-0.21.0/lib/commons-logging-1.1.1.jar:~/pkgs/hadoop-0.21.0/hadoop-common-0.21.0.jar


b. Create a Scribe configuration file that uses HDFS. Change to the scribe src directory:

[user@localhost] more scribe_hdfs.conf
port=1463
max_msg_per_second=2000000
check_interval=3

# DEFAULT
<store>
category=default
type=buffer

target_write_size=20480
max_write_interval=1
buffer_send_rate=2
retry_interval=30
retry_interval_range=10

<primary>
type=file
fs_type=hdfs
file_path=hdfs://localhost:9000/scribetest
create_symlink=no
use_hostname_sub_directory=yes
base_filename=thisisoverwritten
max_size=40000000
rotate_period=daily
rotate_hour=0
rotate_minute=5
add_newlines=1
</primary>
<secondary>
type=file
fs_type=std
file_path=/tmp/scribetest
base_filename=thisisoverwritten
max_size=3000000
</secondary>
</store>


c. Run Scribe
[user@localhost] scribed scribe_hdfs.conf
[Thu Sep 9 15:35:22 2010] "setrlimit error (setting max fd size)"
[Thu Sep 9 15:35:22 2010] "STATUS: STARTING"
[Thu Sep 9 15:35:22 2010] "STATUS: configuring"
[Thu Sep 9 15:35:22 2010] "got configuration data from file <scribe_hdfs.conf>"
[Thu Sep 9 15:35:22 2010] "CATEGORY : default"
[Thu Sep 9 15:35:22 2010] "Creating default store"
[Thu Sep 9 15:35:22 2010] "configured <1> stores"
[Thu Sep 9 15:35:22 2010] "STATUS: "
[Thu Sep 9 15:35:22 2010] "STATUS: ALIVE"
[Thu Sep 9 15:35:22 2010] "Starting scribe server on port 1463"
Thrift: Thu Sep 9 15:35:22 2010 libevent 1.4.13-stable method epoll


If it fails to run and complains with the following:
libboost_system.so.1.40.0: cannot open shared object

Do the following:
[root@localhost] echo '/usr/local/lib/' >> /etc/ld.so.conf.d/my_boost.conf
[root@localhost] /sbin/ldconfig -v


If you see the following output when running scribe:
[user@localhost] scribed scribe_hdfs.conf
[Thu Sep 9 15:39:38 2010] "setrlimit error (setting max fd size)"
[Thu Sep 9 15:39:38 2010] "STATUS: STARTING"
[Thu Sep 9 15:39:38 2010] "STATUS: configuring"
[Thu Sep 9 15:39:38 2010] "got configuration data from file <scribe_hdfs.conf>"
[Thu Sep 9 15:39:38 2010] "CATEGORY : default"
[Thu Sep 9 15:39:38 2010] "Creating default store"
[Thu Sep 9 15:39:38 2010] "configured <1> stores"
[Thu Sep 9 15:39:38 2010] "STATUS: "
[Thu Sep 9 15:39:38 2010] "STATUS: ALIVE"
[Thu Sep 9 15:39:38 2010] "Starting scribe server on port 1463"
[Thu Sep 9 15:39:38 2010] "Exception in main: TNonblockingServer::serve() bind"
[Thu Sep 9 15:39:38 2010] "scribe server exiting"

It may be because port 1463 is not available. Run "netstat -nap | grep 1463" to find out which program is using it.

19. Send something to Scribe to be logged in HDFS
From a different terminal, in the top-level Scribe directory:
[user@localhost] echo "hello world" | examples/scribe_cat test


In the terminal that runs Scribe server, you should see the following output:

[Thu Sep 9 15:46:14 2010] "[test] Creating new category from model default"
[Thu Sep 9 15:46:14 2010] "store thread starting"
[Thu Sep 9 15:46:14 2010] "[hdfs] Connecting to HDFS"
[Thu Sep 9 15:46:14 2010] "[hdfs] Before hdfsConnectNewInstance(localhost, 9000)"
Sep 9, 2010 3:46:14 PM org.apache.hadoop.security.Groups
INFO: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
[Thu Sep 9 15:46:15 2010] "[hdfs] After hdfsConnectNewInstance"
[Thu Sep 9 15:46:15 2010] "[hdfs] Connecting to HDFS"
[Thu Sep 9 15:46:15 2010] "[hdfs] Before hdfsConnectNewInstance(localhost, 9000)"
[Thu Sep 9 15:46:15 2010] "[hdfs] After hdfsConnectNewInstance"
[Thu Sep 9 15:46:15 2010] "[hdfs] opened for write hdfs://localhost:9000/scribetest/test/localhost.localdomain/test-2010-09-09_00000"
[Thu Sep 9 15:46:15 2010] "[test] Opened file for writing"
[Thu Sep 9 15:46:15 2010] "[test] Opened file for writing"
[Thu Sep 9 15:46:15 2010] "[test] Changing state from to "
Opening Primary
[Thu Sep 9 15:46:15 2010] "[test] successfully read <0> entries from file "
[Thu Sep 9 15:46:15 2010] "[test] No more buffer files to send, switching to streaming mode"
[Thu Sep 9 15:46:15 2010] "[test] Changing state from to "


20. Check that the message has been logged to HDFS:
Stop Scribe first, or make the file rotate; otherwise Hadoop won't write the data out to the filesystem.


[user@localhost] hadoop fs -lsr /
10/09/09 16:26:02 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
10/09/09 16:26:03 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /jobtracker
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /jobtracker/jobsInfo
drwxr-xr-x - user supergroup 0 2010-09-09 16:23 /scribetest
drwxr-xr-x - user supergroup 0 2010-09-09 16:23 /scribetest/test
drwxr-xr-x - user supergroup 0 2010-09-09 16:23 /scribetest/test/localhost.localdomain
-rw-r--r-- 3 user supergroup 13 2010-09-09 16:25 /scribetest/test/localhost.localdomain/test-2010-09-09_00000
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /tmp
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /tmp/hadoop
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /tmp/hadoop/mapred
drwx------ - user supergroup 0 2010-09-09 16:21 /tmp/hadoop/mapred/system
-rw------- 1 user supergroup 4 2010-09-09 16:21 /tmp/hadoop/mapred/system/jobtracker.info


Copy the directory out of HDFS to take a look:
[user@localhost] hadoop fs -get /scribetest test
10/09/09 16:26:47 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
10/09/09 16:26:47 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

[user@localhost] more test/test/localhost.localdomain/test-2010-09-09_00000
hello world



One note about the secondary store in the Scribe configuration file: as long as replay_buffer is true (the default), Scribe opens the files for both the primary and secondary stores even in normal operation. It then tries to delete the secondary store file once the primary store is handling the messages. This causes a problem because HDFS has not yet completed its access to the secondary store file, and the following exception occurs:

[Thu Sep 9 16:02:03 2010] "[hdfs] deleteFile hdfs://localhost:9000/scribetest1/test/localhost.localdomain/test_00000"
[Thu Sep 9 16:02:03 2010] "[hdfs] Connecting to HDFS"
[Thu Sep 9 16:02:03 2010] "[hdfs] Before hdfsConnectNewInstance(localhost, 9000)"
[Thu Sep 9 16:02:03 2010] "[hdfs] After hdfsConnectNewInstance"
[Thu Sep 9 16:02:03 2010] "[test] No more buffer files to send, switching to streaming mode"
Exception in thread "main" java.io.IOException: Could not complete write to file /scribetest1/test/localhost.localdomain/test_00000 by DFSClient_1545136365
at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:720)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1350)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1346)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1344)

at org.apache.hadoop.ipc.Client.call(Client.java:905)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
at $Proxy0.complete(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy0.complete(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1406)
at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1393)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:66)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:91)
Call to org/apache/hadoop/fs/FSDataOutputStream::close failed!
[Thu Sep 9 16:02:03 2010] "[hdfs] closed hdfs://localhost:9000/scribetest1/test/localhost.localdomain/test_00000"


There is more information about this error under the "NameNode Logs" link on the http://server_ip:50070/dfshealth.jsp page.

To avoid this problem, either set replay_buffer to false or make the secondary store local instead of HDFS (as in the scribe_hdfs.conf example above).

The following configuration sets replay_buffer to false and uses HDFS for both the primary and secondary stores.

[user@localhost] more hdfs_both.conf
port=1463
max_msg_per_second=2000000
check_interval=3

# DEFAULT
<store>
category=default
type=buffer
replay_buffer=no
target_write_size=20480
max_write_interval=1
buffer_send_rate=2
retry_interval=30
retry_interval_range=10

<primary>
type=file
fs_type=hdfs
file_path=hdfs://localhost:9000/scribetest
create_symlink=no
use_hostname_sub_directory=yes
base_filename=thisisoverwritten
max_size=40000000
rotate_period=daily
rotate_hour=0
rotate_minute=5
add_newlines=1

</primary>
<secondary>
type=file
fs_type=hdfs
file_path=hdfs://localhost:9000/scribetest1
create_symlink=no
use_hostname_sub_directory=yes
base_filename=thisisoverwritten
max_size=40000000
rotate_period=daily
rotate_hour=0
rotate_minute=5
add_newlines=1
</secondary>

</store>





References:
1. Thomas Dudziak's blog: How to install Scribe with HDFS support on Ubuntu Karmic
2. Agile Testing: Compiling, installing and test-running Scribe
3. Google Scribe server group: Failover when writing to HDFS problems

Tuesday, August 17, 2010

Key Computer Architecture Techniques - Pipelining



Pipelining
Pipelining is an implementation technique in which multiple instructions are overlapped in execution. The computer pipeline is divided into stages. Each stage completes a part of an instruction in parallel. The stages are connected one to the next to form a pipe: instructions enter at one end, progress through the stages, and exit at the other end.

Because the pipe stages are hooked together, all the stages must be ready to proceed at the same time. We call the time required to move an instruction one step further in the pipeline a machine cycle. The length of the machine cycle is determined by the time required for the slowest pipe stage.

For example, the classic RISC pipeline is broken into five stages with a set of flip flops between each stage.

1. Instruction fetch
2. Instruction decode and register fetch
3. Execute
4. Memory access
5. Register write back
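
As a quick worked example with made-up stage latencies: suppose the five stages above take 1 ns, 1.5 ns, 2 ns, 1.5 ns and 1 ns. The machine cycle must be 2 ns, the slowest stage. A non-pipelined implementation would need 1 + 1.5 + 2 + 1.5 + 1 = 7 ns per instruction, while the pipelined one, once the pipe is full, completes an instruction every 2 ns.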

Pipelining does not help in all cases. There are several possible disadvantages. An instruction pipeline is said to be fully pipelined if it can accept a new instruction every clock cycle. A pipeline that is not fully pipelined has wait cycles that delay the progress of the pipeline.

Advantages of Pipelining:

1. The cycle time of the processor is reduced, thus increasing instruction issue-rate in most cases.

2. Some combinational circuits such as adders or multipliers can be made faster by adding more circuitry. If pipelining is used instead, it can save circuitry vs. a more complex combinational circuit.

Disadvantages of Pipelining:

1. A non-pipelined processor executes only a single instruction at a time. This prevents branch delays (in effect, every branch is delayed) and problems with serial instructions being executed concurrently. Consequently the design is simpler and cheaper to manufacture.

2. The instruction latency in a non-pipelined processor is slightly lower than in a pipelined equivalent. This is due to the fact that extra flip flops (pipeline registers/buffers) must be added to the data path of a pipelined processor.

3. A non-pipelined processor will have a stable instruction bandwidth. The performance of a pipelined processor is much harder to predict and may vary more widely between different programs.

Friday, August 13, 2010

Computer Performance

From Stanford EE282 + Computer Architecture: A Quantitative Approach, 4th Edition

  • Latency or execution time or response time
    • Wall-clock time to complete a task
  • Bandwidth or throughput or execution rate
    • Number of tasks completed per unit of time

Latency metric: program execution time

CPUTime = Seconds/Program
        = Cycles/Program * Seconds/Cycle
        = Instructions/Program * Cycles/Instruction * Seconds/Cycle
        = IC * CPI * CCT

IC: Instruction Count
CPI: Clock Cycles Per Instruction
CCT: Clock Cycle Time
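
A quick worked example with made-up numbers: a program with IC = 2 * 10^9 instructions, CPI = 1.5, and a 1 GHz clock (CCT = 1 ns) takes

CPUTime = 2 * 10^9 * 1.5 * 1 ns = 3 seconds

Cutting CPI to 1.0 (for example, with a better pipeline) drops the CPU time to 2 seconds, which is why all three factors matter.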

Amdahl’s Law

Speedup = (Execution time for entire task without using the enhancement) / (Execution time for entire task using the enhancement when possible)

It should be greater than 1 (when there is an improvement, that is)

new_execution_time
= (original_execution_time * (1 - enhanced_fraction))  + original_execution_time * enhanced_fraction * (1 / speedup_enhanced)
= (original_execution_time)((1-enhanced_fraction) + enhanced_fraction/speedup_enhanced)

speedup_overall
= original_execution_time / new_execution_time
= 1/((1-enhanced_fraction) + enhanced_fraction/speedup_enhanced)

In the case of parallelization, Amdahl's law states that if enhanced_fraction is the proportion of a program that can be made parallel (i.e. benefit from parallelization), and (1 − enhanced_fraction) is the proportion that cannot be parallelized (remains serial), then the maximum speedup that can be achieved by using N processors (N times faster for the part that can be enhanced = speedup_enhanced) is

speedup_overall
= 1/((1-enhanced_fraction) + enhanced_fraction/N)

In the limit, as N tends to infinity, the maximum speedup tends to 1 / (1 − enhanced_fraction).
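
A quick worked example with made-up numbers: if enhanced_fraction = 0.8 (80% of the program can be parallelized) and N = 4 processors (speedup_enhanced = 4), then

speedup_overall = 1/((1 - 0.8) + 0.8/4) = 1/0.4 = 2.5

and no matter how large N gets, the speedup can never exceed 1/(1 - 0.8) = 5.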

Friday, August 6, 2010

System Diagram of A Modern Laptop

From Wikipedia Intel X58

  

  • Intel X58: Intel X58 Chipset
  • QPI: The Intel QuickPath Interconnect is a point-to-point processor interconnect developed by Intel to compete with HyperTransport
  • I/O Controller Hub (ICH), also known as Intel 82801, is an Intel southbridge on motherboards with Intel chipsets (Intel Hub Architecture). As with any other southbridge, the ICH is used to connect and control peripheral devices.
  • EHCI: The Enhanced Host Controller Interface (EHCI) specification describes the register-level interface for a Host Controller for the Universal Serial Bus (USB) Revision 2.0.
  • DMI: Direct Media Interface (DMI) is point-to-point interconnection between an Intel northbridge and an Intel southbridge on a computer motherboard. It is the successor of the Hub Interface used in previous chipsets. It provides for a 10Gb/s bidirectional data rate.
  • LPC: The Low Pin Count bus, or LPC bus, is used on IBM-compatible personal computers to connect low-bandwidth devices to the CPU, such as the boot ROM and the "legacy" I/O devices (behind a super I/O chip).
  • SPI: The Serial Peripheral Interface Bus or SPI (pronounced "ess-pee-i" or "spy") bus is a synchronous serial data link standard named by Motorola that operates in full duplex mode. Devices communicate in master/slave mode where the master device initiates the data frame.
  • Intel Matrix Storage Technology: It provides new levels of protection, performance, and expandability for desktop and mobile platforms. Whether using one or multiple hard drives, users can take advantage of enhanced performance and lower power consumption. When using more than one drive, the user can have additional protection against data loss in the event of a hard drive failure.
  • Intel Turbo Memory with User Pinning: An on-motherboard flash card, Intel's Turbo Memory is designed to act as another layer in the memory hierarchy, caching data where possible and improving performance/battery life in notebooks. User pinning offers more options to the user to improve system applications launch time and responsiveness.

Heap Memory

From Stanford CS Ed Library: Pointers and Memory

"Heap" memory, also known as "dynamic" memory, is an alternative to local stack memory. Local memory is quite automatic — it is allocated automatically on function call and it is deallocated automatically when a function exits. Heap memory is different in every way. The programmer explicitly requests the allocation of a memory "block" of a particular size, and the block continues to be allocated until the programmer
explicitly requests that it be deallocated. Nothing happens automatically. So the programmer has much greater control of memory, but with greater responsibility since the memory must now be actively managed.

The advantages of heap memory are...

  1. Lifetime. Because the programmer now controls exactly when memory is allocated and deallocated, it is possible to build a data structure in memory, and return that data structure to the caller. This was never possible with local memory which was automatically deallocated when the function exited.
  2. Size. The size of allocated memory can be controlled with more detail. For example, a string buffer can be allocated at run-time which is exactly the right size to hold a particular string. With local memory, the code is more likely to declare a buffer size 1000 and hope for the best.
The disadvantages of heap memory are...
  1. More Work. Heap allocation needs to be arranged explicitly in the code, which is just more work.
  2. More Bugs. Because it's now done explicitly in the code, realistically on occasion the allocation will be done incorrectly leading to memory bugs. Local memory is constrained, but at least it's never wrong.
Nonetheless, there are many problems that can only be solved with heap memory, so that's the way it has to be. In languages with garbage collectors such as Perl, LISP, or Java, the above disadvantages are mostly eliminated. The garbage collector takes over most of the responsibility for heap management at the cost of a little extra time taken at run-time.
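
To make the lifetime and size points above concrete, here is a small C sketch (in the spirit of the Pointers and Memory examples) that builds a string buffer on the heap sized exactly for its contents and hands it back to the caller, who is then responsible for freeing it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of str. The block outlives this function,
   which would be impossible with local (stack) memory. */
char* MakeCopy(const char* str)
{
    char* copy = malloc(strlen(str) + 1);   /* exactly the right size, chosen at run-time */
    if (copy != NULL) {
        strcpy(copy, str);
    }
    return copy;                            /* the caller now owns the block */
}

void MakeCopyCaller(void)
{
    char* s = MakeCopy("hello heap");
    if (s != NULL) {
        printf("%s\n", s);
        free(s);                            /* the programmer must deallocate explicitly */
    }
}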

Thursday, August 5, 2010

Classic swap function in C and C++

In C:


void Swap(int* a, int* b) 
{
    int temp;
    temp = *a;
    *a = *b;
    *b = temp;
}

void SwapCaller() 
{
    int x = 1;
    int y = 2;
    // Use & to pass pointers to the int values of interest
    Swap(&x, &y); 
}

In C++:


// The & declares pass by reference
void Swap(int& a, int& b) 
{
    int temp;
    // No *'s required -- the compiler takes care of it    
    temp = a;
    a = b;
    b = temp;
}

void SwapCaller() 
{
    int x = 1;
    int y = 2;
    // No &'s required -- the compiler takes care of it
    Swap(x, y); 
}

How Does Call Stack Work?

From Stanford CS Ed Library: Pointers and Memory

The exact details of the implementation are language and compiler specific. However, the basic structure below approximates the method used by many different systems and languages...

To call a function such as foo(6, x+1)...

1. Evaluate the actual parameter expressions, such as the x+1, in the caller's context.

2. Allocate memory for foo()'s locals by pushing a suitable "local block" of memory onto a runtime "call stack" dedicated to this purpose. For parameters but not local variables, store the values from step (1) into the appropriate slot in foo()'s local block.

3. Store the caller's current address of execution (its "return address") and switch execution to foo().

4. foo() executes with its local block conveniently available at the end of the call stack.

5. When foo() is finished, it exits by popping its locals off the stack and "returns" to the caller using the previously stored return address. Now the caller's locals are on the end of the stack and it can resume executing.
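
To tie the steps above back to code, here is a small C sketch of the foo(6, x+1) call from the text, with comments that map roughly onto steps 1-5 (the exact layout is, as noted, language and compiler specific).

int foo(int a, int b)
{
    int sum = a + b;    /* (4) foo runs with a, b and sum in its local block at the end of the call stack */
    return sum;         /* (5) foo's locals are popped and control returns to the stored return address */
}

void FooCaller(void)
{
    int x = 10;
    int result;

    /* (1) x+1 is evaluated here, in the caller's context
       (2) a local block for foo is pushed and the values 6 and 11 are stored in its parameter slots
       (3) the caller's return address is stored and execution switches to foo */
    result = foo(6, x + 1);

    /* the caller's locals (x, result) are now back on the end of the stack */
    (void)result;
}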