sphinxsearch questions - Page 1

Arkaaito

Asked: 2014-03-13 12:06:11 +0800 CST

How do I choose an appropriate max_children value for Sphinx?

2

We have a Sphinx install (2.0.3) running on a cluster of 3 EC2 instances (currently m3.large).

Currently we have workers = threads and max_children = 30 in our Sphinx config (same on each box). We are periodically getting the dreaded "temporary searchd error: server maxed out, retry in a second". Our instances are hovering around 5% CPU utilization. Some example top output:

top - 19:51:56 up 22:15,  1 user,  load average: 0.08, 0.04, 0.01
Tasks:  82 total,   2 running,  80 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.0%us,  0.0%sy,  0.0%ni, 98.5%id,  0.3%wa,  0.0%hi,  0.0%si,  0.2%st
Mem:   7872040k total,  2911920k used,  4960120k free,   245168k buffers
Swap:        0k total,        0k used,        0k free,  2190992k cached

All the Sphinx doc seems to say about setting max_children is that it is "useful to control server load". While searching I found a forum post indicating that setting it either too high or too low can cause "server maxed out" - I presume the former is because the individual queries are starved - but had no further tips on choosing the right level. (I can't find the link to this post again to save my life. Sorry.)

Two related questions:

Am I right in thinking the low CPU suggests max_children could/should be higher than 30?
How can I find the optimal number (i.e., the max number of children which [usually] does not lead to query slowdown)? I'm not entirely sure what kind of info Sphinx logs beyond query.log. Is there a tool I can use to determine whether query slowdown is occurring (due to too many parallel queries), and if not, are queries CPU-bound or memory-bound (or should I be looking at some other value entirely)?

thwd

Asked: 2012-05-19 12:03:36 +0800 CST

Simple full-text search server

3

I have been looking at search solutions like Sphinx, Solr and Elasticsearch but they are all way too complex for what I need.

I'm basically looking for a server software, best distributed, that allows me to just throw in chunks of text associated with a single small identifier each. Then find keywords quickly and return the identifiers given for the chunks that yielded a match with one or more of the keywords.

Does something like this exist?

loosecannon

Asked: 2011-05-16 12:58:29 +0800 CST

/etc/rc.local not being run on Ubuntu Desktop Install

3

I have been trying to get sphinx to run at boot, so I added some lines to /etc/rc.local but nothing happens when I start up. If i run it manually it works however.

/etc/init.d/rc.local start works fine as does /etc/rc.local

It's listed in the default runlevel and is all executable but it does not work.

I am considering writing a separate init.d script to do the same thing but that's a lot of work for a simple task

dumbledore:/etc/init.d# ls -l rc*
-rwxr-xr-x 1 root root 8863 2009-09-07 13:58 rc
-rwxr-xr-x 1 root root  801 2009-09-07 13:58 rc.local
-rwxr-xr-x 1 root root  117 2009-09-07 13:58 rcS

dumbledore:/etc/init.d# ls /etc/rc.local  -l
-rwxr-xr-x 1 root root 491 2011-05-14 16:13 /etc/rc.local

dumbledore:/etc/init.d# runlevel
N 2

dumbledore:/etc/init.d# ls /etc/rc2.d/ -l
total 4
lrwxrwxrwx 1 root root  18 2011-04-22 18:53 K08vmware -> /etc/init.d/vmware
-rw-r--r-- 1 root root 677 2011-03-28 15:10 README
lrwxrwxrwx 1 root root  18 2011-04-22 18:53 S19vmware -> /etc/init.d/vmware
lrwxrwxrwx 1 root root  18 2011-05-15 14:09 S20ddclient -> ../init.d/ddclient
lrwxrwxrwx 1 root root  20 2011-03-10 18:00 S20fancontrol -> ../init.d/fancontrol
lrwxrwxrwx 1 root root  20 2011-03-10 18:00 S20kerneloops -> ../init.d/kerneloops
lrwxrwxrwx 1 root root  27 2011-03-10 18:00 S20speech-dispatcher -> ../init.d/speech-dispatcher
lrwxrwxrwx 1 root root  19 2011-03-10 18:00 S25bluetooth -> ../init.d/bluetooth
lrwxrwxrwx 1 root root  20 2011-03-10 18:00 S50pulseaudio -> ../init.d/pulseaudio
lrwxrwxrwx 1 root root  15 2011-03-10 18:00 S50rsync -> ../init.d/rsync
lrwxrwxrwx 1 root root  15 2011-03-10 18:00 S50saned -> ../init.d/saned
lrwxrwxrwx 1 root root  19 2011-03-10 18:00 S70dns-clean -> ../init.d/dns-clean
lrwxrwxrwx 1 root root  18 2011-03-10 18:00 S70pppd-dns -> ../init.d/pppd-dns
lrwxrwxrwx 1 root root  14 2011-05-07 11:22 S75sudo -> ../init.d/sudo
lrwxrwxrwx 1 root root  24 2011-03-10 18:00 S90binfmt-support -> ../init.d/binfmt-support
lrwxrwxrwx 1 root root  17 2011-05-12 21:18 S91apache2 -> ../init.d/apache2
lrwxrwxrwx 1 root root  22 2011-03-10 18:00 S99acpi-support -> ../init.d/acpi-support
lrwxrwxrwx 1 root root  21 2011-03-10 18:00 S99grub-common -> ../init.d/grub-common
lrwxrwxrwx 1 root root  18 2011-03-10 18:00 S99ondemand -> ../init.d/ondemand
lrwxrwxrwx 1 root root  18 2011-03-10 18:00 S99rc.local -> ../init.d/rc.local

dumbledore:/etc/init.d# cat /etc/rc.local

#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

# Start sphinx daemon for rails app on startup
# Added 2011-05-13
# Cannon Matthews
cd /var/www/extemp
/usr/bin/rake ts:config
/usr/bin/rake ts:start
touch ./tmp/ohyeah
cd -

exit 0

dumbledore:/etc/init.d# cat /etc/init.d/rc.local

#! /bin/sh
### BEGIN INIT INFO
# Provides:          rc.local
# Required-Start:    $remote_fs $syslog $all
# Required-Stop:
# Default-Start:     2 3 4 5
# Default-Stop:
# Short-Description: Run /etc/rc.local if it exist
### END INIT INFO


PATH=/sbin:/usr/sbin:/bin:/usr/bin

. /lib/init/vars.sh
. /lib/lsb/init-functions

do_start() {
    if [ -x /etc/rc.local ]; then
            [ "$VERBOSE" != no ] && log_begin_msg "Running local boot scripts (/etc/rc.local)"
        /etc/rc.local
        ES=$?
        [ "$VERBOSE" != no ] && log_end_msg $ES
        return $ES
    fi
}

case "$1" in
    start)
    do_start
        ;;
    restart|reload|force-reload)
        echo "Error: argument '$1' not supported" >&2
        exit 3
        ;;
    stop)
        ;;
    *)
        echo "Usage: $0 start|stop" >&2
        exit 3
        ;;
esac

Ian

Asked: 2010-02-23 07:14:23 +0800 CST

Site crawler/spider that tosses results into mysql

4

It's been suggested that we use mysql for our site's search as it'd be running on the same server that hosts our web server (nginx) and our db (mysql).

Since not all of our pages are created from the database, it's been suggested that we have a crawler that can crawl the site, and toss the page url and data into mysql and have sphinx index on that.

Does anyone know of an open source spider that has a mysql storing option out of the box.

Thanks.

4 revs, 3 users 67%anon

Asked: 2009-05-16 06:06:56 +0800 CST

How do I make sphinx restart when I reboot my Ubuntu server?

12

I built and installed sphinx search on my ubuntu 9.04 server.

How do I make the sphinx daemon start automatically when I reboot?

How do I choose an appropriate max_children value for Sphinx?

Simple full-text search server

/etc/rc.local not being run on Ubuntu Desktop Install

Site crawler/spider that tosses results into mysql

How do I make sphinx restart when I reboot my Ubuntu server?

Can you pass user/pass for HTTP Basic Authentication in URL parameters?

Ping a Specific Port

Check if port is open or closed on a Linux server?

How to automate SSH login with password?

How do I tell Git for Windows where to find my private RSA key?

What's the default superuser username/password for postgres after a new install?

What port does SFTP use?

Command line to list users in a Windows Active Directory group?

What is a Pem file and how does it differ from other OpenSSL Generated Key File Formats?

How to determine if a bash variable is empty?

Questions[sphinxsearch](server)