Installing Scribe on CentOS 5.3 64-bit
1. Java SE Development Kit (JDK) 6, latest update - http://www.oracle.com/technetwork/java/javase/downloads/index.html. I used update 20; the Java directory is /usr/java/jdk1.6.0_20.
2. ruby-1.8.5-5.el5_4.8 + ruby-devel-1.8.5-5.el5_4.8 (using yum)
3. python-2.4.3-24.el5 + python-devel-2.4.3-24.el5.x86_64 (using yum)
4. libevent + libevent-devel - libevent-1.4.13-1/libevent-devel-1.4.13-1 (using yum)
5. gcc-c++-4.1.2-46.el5_4.2
6. boost 1.40 - http://downloads.sourceforge.net/project/boost/boost/1.40.0/boost_1_40_0.tar.gz?use_mirror=softlayer
[user@localhost] ./bootstrap.sh
[user@localhost] ./bjam
[user@localhost] sudo su -
[root@localhost] ./bjam install
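By default, ./bjam install puts the Boost headers under /usr/local/include and the libraries under /usr/local/lib. A quick sanity check that the libraries Scribe links against are in place (paths assume the default install prefix):
[root@localhost] ls /usr/local/lib/libboost_system* /usr/local/lib/libboost_filesystem*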
7. flex-2.5.4a-41.fc6 (using yum)
8. m4-1.4.15 - ftp.gnu.org/gnu/m4 (do not use the version from yum)
9. imake-1.0.2-3.x86_64 (using yum)
10. autoconf-2.65 - ftp.gnu.org/gnu/autoconf (do not use the version from yum)
11. automake-1.11.1 - ftp.gnu.org/gnu/automake (do not use the version from yum)
12. libtool-2.2.6b - ftp.gnu.org/gnu/libtool (do not use the version from yum)
13. bison-2.3-2.1 (using yum). The build actually needs yacc, so create a yacc wrapper script that simply calls bison:
[root@localhost] more /usr/bin/yacc
#!/bin/sh
exec bison -y "$@"
[root@localhost] chmod +x /usr/bin/yacc
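If you prefer to create the wrapper in one step, a here-document (run as root) does the same thing:
[root@localhost] cat > /usr/bin/yacc <<'EOF'
#!/bin/sh
exec bison -y "$@"
EOF
[root@localhost] chmod +x /usr/bin/yacc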
14. Thrift - latest version: http://incubator.apache.org/thrift/download
thrift-0.2.0 - http://archive.apache.org/dist/incubator/thrift/0.2.0-incubating
thrift-0.4.0 - http://archive.apache.org/dist/incubator/thrift/0.4.0-incubating
I am using thrift-0.2.0 in this example.
[user@localhost] ./bootstrap.sh
[user@localhost] ./configure
If you see this:
error: ./configure: line 21183: syntax error near unexpected token `MONO,'
Copy pkg.m4 from /usr/share/aclocal to Thrift's aclocal directory. From the top-level thrift directory, do the following:
cp /usr/share/aclocal/pkg.m4 aclocal
Then again:
[user@localhost] ./bootstrap.sh
[user@localhost] ./configure
[user@localhost] make
[user@localhost] sudo su -
[root@localhost] make install
[root@localhost] exit
You may see the following error when building Thrift 0.4.0 or 0.5.0:
make[4]: Entering directory `/home/user/pkgs/thrift-0.4.0/lib/cpp'
/bin/sh ../../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I. -I../.. -I/usr/local/include -I./src -Wall -g -O2 -MT ThreadManager.lo -MD -MP -MF .deps/ThreadManager.Tpo -c -o ThreadManager.lo `test -f 'src/concurrency/ThreadManager.cpp' || echo './'`src/concurrency/ThreadManager.cpp
libtool: compile: g++ -DHAVE_CONFIG_H -I. -I../.. -I/usr/local/include -I./src -Wall -g -O2 -MT ThreadManager.lo -MD -MP -MF .deps/ThreadManager.Tpo -c src/concurrency/ThreadManager.cpp -fPIC -DPIC -o .libs/ThreadManager.o
In file included from src/concurrency/ThreadManager.cpp:20:
src/concurrency/ThreadManager.h:24:26: tr1/functional: No such file or directory
To fix it, change line 24 of src/concurrency/ThreadManager.h from
#include <tr1/functional>
to
#include <boost/tr1/tr1/functional>
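A one-liner that makes the same edit from the top-level thrift directory (assuming the stock 0.4.0 source layout):
[user@localhost] sed -i 's|<tr1/functional>|<boost/tr1/tr1/functional>|' lib/cpp/src/concurrency/ThreadManager.h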
We also need to compile and install the Facebook fb303 library. From the top-level thrift directory:
[user@localhost] cd contrib/fb303
[user@localhost] ./bootstrap.sh
[user@localhost] ./configure
[user@localhost] make
[user@localhost] sudo su -
[root@localhost] make install
[root@localhost] exit
15. hadoop 0.21.0 - http://www.apache.org/dyn/closer.cgi/hadoop/core/
[user@localhost] cd hadoop-0.21.0/hdfs/src/c++/libhdfs
[user@localhost] ./configure JVM_ARCH=tune=k8 --with-java=/usr/java/jdk1.6.0_20
[user@localhost] make
[user@localhost] sudo su -
[root@localhost] cp .libs/libhdfs.so .libs/libhdfs.so.0 /usr/local/lib
[root@localhost] cp hdfs.h /usr/local/include
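Optionally, while still root, register the library directories with the dynamic linker: at runtime scribed loads libhdfs, which in turn needs the JVM's libjvm.so. This is a sketch; the conf file name is arbitrary and the JDK path assumes the 64-bit JDK 6u20 install from step 1.
[root@localhost] cat > /etc/ld.so.conf.d/hdfs.conf <<'EOF'
/usr/local/lib
/usr/java/jdk1.6.0_20/jre/lib/amd64/server
EOF
[root@localhost] /sbin/ldconfig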
[root@localhost] exit
16. scribe-2.2 - http://github.com/downloads/facebook/scribe/scribe-2.2.tar.gz - you have to use scribe-2.1 or above for HDFS support.
[user@localhost] ./bootstrap.sh --enable-hdfs
[user@localhost] ./configure
[user@localhost] make
17. Configure and Run Hadoop (single-node cluster in this tutorial).
a. First we need to modify a few configuration files. From the top-level hadoop directory, edit the conf/hadoop-env.sh, conf/core-site.xml, conf/hdfs-site.xml, and conf/mapred-site.xml files.
[user@localhost] more conf/hadoop-env.sh
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
# Set Hadoop-specific environment variables here.
# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.
# The java implementation to use. Required.
export JAVA_HOME=/usr/java/jdk1.6.0_20
.
.
.
[user@localhost] more conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
[user@localhost] more conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
[user@localhost] more conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
b. Format the namenode. From the top-level hadoop directory:
[user@localhost] bin/hadoop namenode -format
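Note that start-all.sh in the next step uses ssh to launch the daemons, even on a single node; if passwordless ssh to localhost is not set up yet, something like the following should take care of it:
[user@localhost] ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
[user@localhost] cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[user@localhost] chmod 600 ~/.ssh/authorized_keys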
c. Start hadoop
[user@localhost] start-all.sh
d. Use jps to check if all the processes are started.
[user@localhost] jps
25362 JobTracker
24939 NameNode
25099 DataNode
25506 TaskTracker
25251 SecondaryNameNode
25553 Jps
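If any of these daemons is missing, look at its log under the logs/ directory of the Hadoop install. The exact filename includes your username and hostname, so a wildcard works; for example, for the namenode:
[user@localhost] tail -n 50 logs/hadoop-*-namenode-*.log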
e. Use netstat to check that port 9000 (set in core-site.xml) is listening.
[user@localhost] sudo netstat -nap | grep 9000
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 24939/java
tcp 0 0 127.0.0.1:9000 127.0.0.1:59957 ESTABLISHED 24939/java
tcp 0 0 127.0.0.1:9000 127.0.0.1:59960 ESTABLISHED 24939/java
tcp 0 0 127.0.0.1:59957 127.0.0.1:9000 ESTABLISHED 25099/java
tcp 0 0 127.0.0.1:59960 127.0.0.1:9000 ESTABLISHED 25251/java
f. Open a browser and go to server_ip:50070 to check that the Cluster Summary shows 1 Live Nodes. Be patient; it sometimes takes 30 seconds to a minute to show up. To refresh the page, put the cursor in the address bar and press Enter; for some reason, F5 (Reload) doesn't work for me.
18. Configure and Run Scribe
a. Set up the Java class path for Scribe. I installed Hadoop 0.21.0 in the ~/pkgs/hadoop-0.21.0 directory.
[user@localhost] export CLASSPATH=~/pkgs/hadoop-0.21.0/hadoop-hdfs-0.21.0.jar:~/pkgs/hadoop-0.21.0/lib/commons-logging-1.1.1.jar:~/pkgs/hadoop-0.21.0/hadoop-common-0.21.0.jar
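If you would rather not list the jars by hand, a small loop over the Hadoop lib directory builds the same class path. This is a sketch, assuming the same ~/pkgs/hadoop-0.21.0 install location:
HADOOP_HOME=~/pkgs/hadoop-0.21.0
# start with the core Hadoop jars, then append everything under lib/
CLASSPATH=$HADOOP_HOME/hadoop-hdfs-0.21.0.jar:$HADOOP_HOME/hadoop-common-0.21.0.jar
for jar in "$HADOOP_HOME"/lib/*.jar; do
  CLASSPATH=$CLASSPATH:$jar
done
export CLASSPATH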
b. Write a Scribe configuration file that uses HDFS. Change to the Scribe src directory:
[user@localhost] more scribe_hdfs.conf
port=1463
max_msg_per_second=2000000
check_interval=3
# DEFAULT
<store>
category=default
type=buffer
target_write_size=20480
max_write_interval=1
buffer_send_rate=2
retry_interval=30
retry_interval_range=10
<primary>
type=file
fs_type=hdfs
file_path=hdfs://localhost:9000/scribetest
create_symlink=no
use_hostname_sub_directory=yes
base_filename=thisisoverwritten
max_size=40000000
rotate_period=daily
rotate_hour=0
rotate_minute=5
add_newlines=1
</primary>
<secondary>
type=file
fs_type=std
file_path=/tmp/scribetest
base_filename=thisisoverwritten
max_size=3000000
</secondary>
</store>
c. Run Scribe
[user@localhost] scribed scribe_hdfs.conf
[Thu Sep 9 15:35:22 2010] "setrlimit error (setting max fd size)"
[Thu Sep 9 15:35:22 2010] "STATUS: STARTING"
[Thu Sep 9 15:35:22 2010] "STATUS: configuring"
[Thu Sep 9 15:35:22 2010] "got configuration data from file <scribe_hdfs.conf>"
[Thu Sep 9 15:35:22 2010] "CATEGORY : default"
[Thu Sep 9 15:35:22 2010] "Creating default store"
[Thu Sep 9 15:35:22 2010] "configured <1> stores"
[Thu Sep 9 15:35:22 2010] "STATUS: "
[Thu Sep 9 15:35:22 2010] "STATUS: ALIVE"
[Thu Sep 9 15:35:22 2010] "Starting scribe server on port 1463"
Thrift: Thu Sep 9 15:35:22 2010 libevent 1.4.13-stable method epoll
If it cannot run and complains as follows:
libboost_system.so.1.40.0: cannot open shared object
Do the following (as root):
[root@localhost] echo '/usr/local/lib/' >> /etc/ld.so.conf.d/my_boost.conf
[root@localhost] /sbin/ldconfig -v
If you see the following output when running scribe:
[user@localhost] scribed scribe_hdfs.conf
[Thu Sep 9 15:39:38 2010] "setrlimit error (setting max fd size)"
[Thu Sep 9 15:39:38 2010] "STATUS: STARTING"
[Thu Sep 9 15:39:38 2010] "STATUS: configuring"
[Thu Sep 9 15:39:38 2010] "got configuration data from file <scribe_hdfs.conf>"
[Thu Sep 9 15:39:38 2010] "CATEGORY : default"
[Thu Sep 9 15:39:38 2010] "Creating default store"
[Thu Sep 9 15:39:38 2010] "configured <1> stores"
[Thu Sep 9 15:39:38 2010] "STATUS: "
[Thu Sep 9 15:39:38 2010] "STATUS: ALIVE"
[Thu Sep 9 15:39:38 2010] "Starting scribe server on port 1463"
[Thu Sep 9 15:39:38 2010] "Exception in main: TNonblockingServer::serve() bind"
[Thu Sep 9 15:39:38 2010] "scribe server exiting"
It may be because port 1463 is not available. Run "netstat -nap | grep 1463" to find out which program is using it.
19. Send something to Scribe to be logged in HDFS.
From a different terminal, in the top-level Scribe directory:
[user@localhost] echo "hello world" | examples/scribe_cat test
In the terminal running the Scribe server, you should see the following output:
[Thu Sep 9 15:46:14 2010] "[test] Creating new category from model default"
[Thu Sep 9 15:46:14 2010] "store thread starting"
[Thu Sep 9 15:46:14 2010] "[hdfs] Connecting to HDFS"
[Thu Sep 9 15:46:14 2010] "[hdfs] Before hdfsConnectNewInstance(localhost, 9000)"
Sep 9, 2010 3:46:14 PM org.apache.hadoop.security.Groups
INFO: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
[Thu Sep 9 15:46:15 2010] "[hdfs] After hdfsConnectNewInstance"
[Thu Sep 9 15:46:15 2010] "[hdfs] Connecting to HDFS"
[Thu Sep 9 15:46:15 2010] "[hdfs] Before hdfsConnectNewInstance(localhost, 9000)"
[Thu Sep 9 15:46:15 2010] "[hdfs] After hdfsConnectNewInstance"
[Thu Sep 9 15:46:15 2010] "[hdfs] opened for write hdfs://localhost:9000/scribetest/test/localhost.localdomain/test-2010-09-09_00000"
[Thu Sep 9 15:46:15 2010] "[test] Opened file for writing"
[Thu Sep 9 15:46:15 2010] "[test] Opened file for writing"
[Thu Sep 9 15:46:15 2010] "[test] Changing state from to "
Opening Primary
[Thu Sep 9 15:46:15 2010] "[test] successfully read <0> entries from file "
[Thu Sep 9 15:46:15 2010] "[test] No more buffer files to send, switching to streaming mode"
[Thu Sep 9 15:46:15 2010] "[test] Changing state from to "
20. Check if the message has been logged to HDFS:
Stop Scribe first, or wait for the file to rotate; otherwise Hadoop won't flush the file to the filesystem.
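If you'd rather not kill the process by hand, Scribe ships a small control script under examples/; assuming the Thrift/fb303 Python bindings were installed along with Thrift in step 14, you can check and stop the server with it:
[user@localhost] examples/scribe_ctrl status 1463
[user@localhost] examples/scribe_ctrl stop 1463
With Scribe stopped, list what landed in HDFS: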
[user@localhost] hadoop fs -lsr /
10/09/09 16:26:02 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
10/09/09 16:26:03 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /jobtracker
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /jobtracker/jobsInfo
drwxr-xr-x - user supergroup 0 2010-09-09 16:23 /scribetest
drwxr-xr-x - user supergroup 0 2010-09-09 16:23 /scribetest/test
drwxr-xr-x - user supergroup 0 2010-09-09 16:23 /scribetest/test/localhost.localdomain
-rw-r--r-- 3 user supergroup 13 2010-09-09 16:25 /scribetest/test/localhost.localdomain/test-2010-09-09_00000
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /tmp
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /tmp/hadoop
drwxr-xr-x - user supergroup 0 2010-09-09 16:21 /tmp/hadoop/mapred
drwx------ - user supergroup 0 2010-09-09 16:21 /tmp/hadoop/mapred/system
-rw------- 1 user supergroup 4 2010-09-09 16:21 /tmp/hadoop/mapred/system/jobtracker.info
Copy the directory out of HDFS to take a look:
[user@localhost] hadoop fs -get /scribetest test
10/09/09 16:26:47 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
10/09/09 16:26:47 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
[user@localhost] more test/test/localhost.localdomain/test-2010-09-09_00000
hello world
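Alternatively, you can read the file directly from HDFS without copying it out, using the same path shown in the listing above:
[user@localhost] hadoop fs -cat /scribetest/test/localhost.localdomain/test-2010-09-09_00000
hello world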
One note about the secondary store in the Scribe configuration file: Scribe opens the files for both the primary and secondary stores even in the normal case, as long as replay_buffer is true (the default). It then tries to delete the secondary store file once the primary store is handling the messages. This causes a problem because HDFS has not yet completed its access to the secondary store file, and the following exception occurs:
[Thu Sep 9 16:02:03 2010] "[hdfs] deleteFile hdfs://localhost:9000/scribetest1/test/localhost.localdomain/test_00000"
[Thu Sep 9 16:02:03 2010] "[hdfs] Connecting to HDFS"
[Thu Sep 9 16:02:03 2010] "[hdfs] Before hdfsConnectNewInstance(localhost, 9000)"
[Thu Sep 9 16:02:03 2010] "[hdfs] After hdfsConnectNewInstance"
[Thu Sep 9 16:02:03 2010] "[test] No more buffer files to send, switching to streaming mode"
Exception in thread "main" java.io.IOException: Could not complete write to file /scribetest1/test/localhost.localdomain/test_00000 by DFSClient_1545136365
at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:720)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1350)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1346)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1344)
at org.apache.hadoop.ipc.Client.call(Client.java:905)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
at $Proxy0.complete(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy0.complete(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1406)
at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1393)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:66)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:91)
Call to org/apache/hadoop/fs/FSDataOutputStream::close failed!
[Thu Sep 9 16:02:03 2010] "[hdfs] closed hdfs://localhost:9000/scribetest1/test/localhost.localdomain/test_00000"
There is more information about this error behind the "NameNode Logs" link on the http://server_ip:50070/dfshealth.jsp page.
To avoid this problem, either set replay_buffer to false or make the secondary store local instead of HDFS (as in the scribe_hdfs.conf example above).
The following configuration sets replay_buffer to false and uses HDFS for both the primary and secondary stores.
[user@localhost] more hdfs_both.conf
port=1463
max_msg_per_second=2000000
check_interval=3
# DEFAULT
<store>
category=default
type=buffer
replay_buffer=no
target_write_size=20480
max_write_interval=1
buffer_send_rate=2
retry_interval=30
retry_interval_range=10
<primary>
type=file
fs_type=hdfs
file_path=hdfs://localhost:9000/scribetest
create_symlink=no
use_hostname_sub_directory=yes
base_filename=thisisoverwritten
max_size=40000000
rotate_period=daily
rotate_hour=0
rotate_minute=5
add_newlines=1
</primary>
<secondary>
type=file
fs_type=hdfs
file_path=hdfs://localhost:9000/scribetest1
create_symlink=no
use_hostname_sub_directory=yes
base_filename=thisisoverwritten
max_size=40000000
rotate_period=daily
rotate_hour=0
rotate_minute=5
add_newlines=1
</secondary>
</store>
References:
1. Thomas Dudziak's blog: How to install Scribe with HDFS support on Ubuntu Karmic
2. Agile Testing: Compiling, installing and test-running Scribe
3. Google Scribe server group: Failover when writing to HDFS problems
Comments:
Your blog helps us a lot! Thank you for posting it!
Thank you! I'm glad you like it. It feels good to be able to help.
Thank you very much!
Dear all,
I built Scribe using this tutorial but I ran into problems:
conn_pool.o: In function `__static_initialization_and_destruction_0(int, int)':
conn_pool.cpp:(.text+0x71): undefined reference to `boost::system::get_system_category()'
conn_pool.cpp:(.text+0x7b): undefined reference to `boost::system::get_generic_category()'
conn_pool.cpp:(.text+0x85): undefined reference to `boost::system::get_generic_category()'
conn_pool.cpp:(.text+0x8f): undefined reference to `boost::system::get_generic_category()'
conn_pool.cpp:(.text+0x99): undefined reference to `boost::system::get_system_category()'
scribe_server.o: In function `__static_initialization_and_destruction_0(int, int)':
scribe_server.cpp:(.text+0xe1): undefined reference to `boost::system::get_system_category()'
scribe_server.cpp:(.text+0xeb): undefined reference to `boost::system::get_generic_category()'
scribe_server.cpp:(.text+0xf5): undefined reference to `boost::system::get_generic_category()'
scribe_server.cpp:(.text+0xff): undefined reference to `boost::system::get_generic_category()'
scribe_server.cpp:(.text+0x109): undefined reference to `boost::system::get_system_category()'
Can anybody tell me why?
Thanks in advance.
Have you done Step 6?
Please make sure you have the libboost_* libraries in your /usr/lib directory.
Thanks so much for replying.
I used the defaults and realized that the libboost* libraries are located in /usr/local/lib/, so the Scribe build could not reference them. I then edited two properties in three Makefile files,
From
BOOST_CPPFLAGS = -I/usr/include
BOOST_LDFLAGS = -L/usr/lib
To:
BOOST_CPPFLAGS = -I/usr/local/include
BOOST_LDFLAGS = -L/usr/local/lib
Finally, I was able to build Scribe.
Trang anh
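An alternative to editing the generated Makefiles is to point configure at the Boost location directly, using the standard autoconf variables. A sketch, assuming Boost was installed under the default /usr/local prefix:
[user@localhost] ./configure CPPFLAGS="-I/usr/local/include" LDFLAGS="-L/usr/local/lib"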
Dear all,
Can anybody show me how to read the data stored on the server using Java?
Thanks and regards,
Trang Anh.
Dear all,
I followed this tutorial and got this error when building Thrift. I used Thrift 0.5 on CentOS 5.5, 64-bit:
....................
checking for egrep... /bin/grep -E
checking for a sed that does not truncate output... /bin/sed
checking for cc... cc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether cc accepts -g... yes
checking for cc option to accept ISO C89... none needed
checking how to run the C preprocessor... cc -E
checking for icc... no
checking for suncc... no
checking whether cc understands -c and -o together... yes
checking for system library directory... lib
checking if compiler supports -R... no
checking if compiler supports -Wl,-rpath,... yes
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking target system type... x86_64-unknown-linux-gnu
configure: error: Cannot find php-config. Please use --with-php-config=PATH
configure: error: ./configure failed for lib/php/src/ext/thrift_protocol
Can anybody show me how to fix it?
Thanks in advance.
Please install php-devel.
$ sudo yum install php-devel.x86_64
Dear all,
Environment: CentOS 5.5
I installed the Scribe log server following the tutorial above successfully, but when I send a message to Scribe, the server throws this exception:
Error occurred during initialization of VM
Unable to load native library: /usr/local/libjava.so: cannot open shared object file: No such file or directory
Any ideas on how to resolve the problem?
Thanks in advance.
I have a Scribe server running that writes events to a port Flume listens on, and Flume in turn writes to HDFS. Now I am trying to configure Scribe to support HDFS directly. For this, I reconfigured Scribe with this command and options:
./configure --with-hadooppath=/opt/hadoop-2.0.2/ CPPFLAGS="-DHAVE_INTTYPES_H -DHAVE_NETINET_IN_H -Lopt/hadoop-2.0.2/lib/native/ -I$JAVA_HOME/java/jdk1.6.0_31" -with-boost-filesystem=boost_filesystem --enable-hdfs
After this, while trying to execute "make && make install", the compilation terminated with the following error:
/opt/hadoop-2.0.2//include/hdfs.h: In member function 'virtual void HdfsFile::deleteFile()':
/opt/hadoop-2.0.2//include/hdfs.h:437: error: too few arguments to function 'int hdfsDelete(hdfs_internal*, const char*, int)'
HdfsFile.cpp:134: error: at this point in file
HdfsFile.cpp: In member function 'hdfs_internal* HdfsFile::connectToPath(const char*)':
HdfsFile.cpp:203: warning: deprecated conversion from string constant to 'char*'
So I edited HdfsFile.cpp and added an argument to the hdfsDelete call:
hdfsDelete(fileSys, filename.c_str(), 1); (previously it was hdfsDelete(fileSys, filename.c_str());)
After this, I reconfigured Scribe again and executed make && make install, and it gave the following error:
HdfsFile.o: In function `HdfsFile::deleteFile()':
HdfsFile.cpp:(.text+0xe6): undefined reference to `hdfsDelete'
HdfsFile.o: In function `HdfsFile::fileSize()':
HdfsFile.cpp:(.text+0x152): undefined reference to `hdfsGetPathInfo'
HdfsFile.cpp:(.text+0x168): undefined reference to `hdfsFreeFileInfo'
HdfsFile.o: In function `HdfsFile::write(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
HdfsFile.cpp:(.text+0x1d8): undefined reference to `hdfsWrite'
HdfsFile.o: In function `HdfsFile::close()':
HdfsFile.cpp:(.text+0x21e): undefined reference to `hdfsCloseFile'
HdfsFile.o: In function `HdfsFile::openWrite()':
HdfsFile.cpp:(.text+0x2f5): undefined reference to `hdfsExists'
HdfsFile.cpp:(.text+0x318): undefined reference to `hdfsOpenFile'
HdfsFile.o: In function `HdfsFile::openRead()':
HdfsFile.cpp:(.text+0x3fe): undefined reference to `hdfsOpenFile'
HdfsFile.o: In function `HdfsFile::~HdfsFile()':
HdfsFile.cpp:(.text+0x489): undefined reference to `hdfsDisconnect'
HdfsFile.o: In function `HdfsFile::connectToPath(char const*)':
HdfsFile.cpp:(.text+0x59d): undefined reference to `hdfsConnectNewInstance'
HdfsFile.o: In function `HdfsFile::~HdfsFile()':
HdfsFile.cpp:(.text+0x8d9): undefined reference to `hdfsDisconnect'
HdfsFile.o: In function `HdfsFile::listImpl(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&)':
HdfsFile.cpp:(.text+0x944): undefined reference to `hdfsExists'
HdfsFile.cpp:(.text+0x976): undefined reference to `hdfsListDirectory'
HdfsFile.cpp:(.text+0xad9): undefined reference to `hdfsFreeFileInfo'
HdfsFile.o: In function `HdfsFile::flush()':
HdfsFile.cpp:(.text+0x18e): undefined reference to `hdfsFlush'
collect2: ld returned 1 exit status
I cannot figure out what exactly I am missing.
The following are the installed software versions and environment variables:
Centos x86_64 6.3,
hadoop-2.0.2,
scribe-2.1,
thrift-0.9.0,
boost-1.41
libhdfs is located at "/opt/hadoop-2.0.2/lib/native/"
HADOOP_HOME is set to /opt/hadoop
and the following are the contents of .bashrc:
export LD_LIBRARY_PATH=/disk1/root/build/thrift-0.9.0:/disk1/root/build/thrift-0.9.0/contrib/fb303:/usr/local/lib/
export PATH=$LD_LIBRARY_PATH:$SCRIBE_HOME:$PATH
If this is a problem with the version of Hadoop or libhdfs, please tell me which version is needed for the given system configuration, and also how libhdfs can be configured separately before configuring Scribe with HDFS.
Please help in this regard.