I want to run Nutch on Linux. I have logged in as the root user, set all the environment variables, and configured the Nutch settings. I have created a url.txt file containing the URL to crawl. When I try to run Nutch using the following command,
bin/nutch crawl urls -dir pra
it generates following exception.
crawl started in: pra
rootUrlDir = urls
threads = 10
depth = 5
Injector: starting
Injector: crawlDb: pra/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: Failed to get the current user's information.
at org.apache.hadoop.mapred.JobClient.getUGI(JobClient.java:717)
at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:592)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:788)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:113)
Caused by: javax.security.auth.login.LoginException: Login failed: Cannot run program "whoami": java.io.IOException: error=12, Cannot allocate memory
at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
at org.apache.hadoop.mapred.JobClient.getUGI(JobClient.java:715)
... 5 more
The server has enough space to run any Java application. I have attached the statistics:
                     total      used      free
Mem:                524320    194632    329688
-/+ buffers/cache:            194632    329688
Swap:              2475680         0   2475680
Total:             3000000    194632   2805368
Is this sufficient memory for Nutch? Please, can someone help me? I am new to Linux and Nutch. Thanks in advance.
Read the output: it looks like you don't have enough RAM, or no swap file/partition.
Calls to external executables (like whoami) from Java first fork the JVM, which requires an entire copy of the Java process in memory. You will want to drop your maximum heap size (e.g. -Xmx256m) to the point where two copies can fit in RAM at the same time.
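Something like the following should do it with the stock bin/nutch launcher; this is a sketch assuming a Nutch 1.x script, which reads NUTCH_HEAPSIZE (in MB) and turns it into an -Xmx option, so check that your copy of the script has that hook:

# Cap the crawler's heap so a forked copy of the JVM still fits in RAM.
# The stock bin/nutch script turns NUTCH_HEAPSIZE (in MB) into -XmxNNNm.
export NUTCH_HEAPSIZE=256
bin/nutch crawl urls -dir pra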
On a 32-bit operating system installation, the JVM (Java Virtual Machine) cannot address more than 4 GB of memory. If you want the JVM to use more than 4 GB, you have to use a 64-bit JVM, which in turn means the operating system must also be 64-bit.
I presume that is why you are getting that error. You have 5 GB of memory, and that could be the problem. You should either tell your application to use only 75% of the available memory, or try reducing the RAM to 4 GB and check again. I had the same issue with the Zimbra messaging solution, which uses Java for the web interface.
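You can quickly confirm whether both the operating system and the JVM are 64-bit from the shell (assuming the standard uname and java binaries are on your path):

# x86_64 means a 64-bit kernel; i386/i686 means 32-bit.
uname -m
# A 64-bit JVM prints "64-Bit Server VM" in its version banner.
java -version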
It is possible that your server has memory overcommit disabled via /proc/sys/vm/overcommit_memory. Without overcommit, a fork() system call requires that your server have enough RAM or swap for a complete second copy of the Java process, which may be a lot of RAM.
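You can check the current policy, and relax it if you have root access; 0 is the kernel's heuristic default, 1 always allows overcommit, and 2 enforces strict accounting:

# Show the current overcommit policy.
cat /proc/sys/vm/overcommit_memory
# Temporarily allow overcommit so fork() of a large JVM can succeed.
sysctl -w vm.overcommit_memory=1
# Equivalent without sysctl:
echo 1 > /proc/sys/vm/overcommit_memory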