
sphinx

Posted: Tue Jan 09, 2018 1:37 pm
by diszk
Hello,

Could you please tell me if it's possible to use Sphinx search without a search keyword? I'd like to use Sphinx for the "all popular" and "all new" pages. Many thanks in advance.

Re: sphinx

Posted: Tue Jan 09, 2018 2:09 pm
by admin
Hi, what's the point of doing that?
Sphinx is designed to perform text search. Are you going to somehow copy CTR data into Sphinx every minute?

Re: sphinx

Posted: Tue Jan 09, 2018 2:21 pm
by diszk
Thanks for the fast reply. I have 600k galleries now, and there will be approximately 5M. As far as I can see, opening a category page with group_id=10 is much slower than opening one with search=amateur (Sphinx search). Also, with the Sphinx way I don't have to limit the number of pages, so the overall experience is better. I'd like to do this on all the popular etc. pages where there is no search keyword.

So I'd like to query Sphinx without MATCH, so that I can get all the galleries sorted by date/duration. I use Sphinx real-time indexes this way in my own projects, but I don't really know how to achieve this with the standard Sphinx delta index approach. Thank you.
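
To make the idea concrete, here is a minimal sketch of what a keyword-less listing query could look like. SphinxQL accepts a SELECT with no MATCH() clause, in which case it scans the index and filters on attributes only. The index name `galleries` and the attributes `group_id`, `activation_date` and `duration` are assumptions for illustration, not the actual SCJ index definition.

```python
# Sketch only: builds a SphinxQL statement with no MATCH() clause.
# The index and attribute names used below are hypothetical.
SORTABLE = {"activation_date", "duration"}


def build_listing_query(index, filters, order_by, offset=0, limit=100):
    """Return a SphinxQL SELECT that filters on integer attributes only."""
    if order_by not in SORTABLE:
        raise ValueError("unsupported sort attribute: %s" % order_by)
    # Attribute names must come from trusted code; values are forced to int.
    where = " AND ".join(
        "%s = %d" % (name, int(value)) for name, value in sorted(filters.items())
    )
    sql = "SELECT id FROM %s" % index
    if where:
        sql += " WHERE " + where
    sql += " ORDER BY %s DESC LIMIT %d, %d" % (order_by, int(offset), int(limit))
    return sql


if __name__ == "__main__":
    print(build_listing_query("galleries", {"group_id": 10}, "activation_date"))
```

The resulting string could be sent through any MySQL client connected to the searchd SphinxQL port. Deep paging is still capped by the max_matches setting, so this alone does not guarantee unlimited pages.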

Re: sphinx

Posted: Tue Jan 09, 2018 2:30 pm
by admin
If you sort galleries using Sphinx, it will slow things down, because Sphinx is designed to search for keywords, not to sort results.
There are dedicated fields for date and duration in the DB, and MySQL should do the sorting.

Did you check the slow queries and which indexes they use?

Did you try updates?

Re: sphinx

Posted: Tue Jan 09, 2018 2:47 pm
by diszk
The update is done by cron, 1-2 times daily; the slow query is at the end of this post. As I remember, I checked, and every WHERE condition parameter is indexed. I guess that query is there for the number of galleries (<!--TOTAL_ITEMS-->). I could just ignore this and set up hardcoded pages so it won't need to count the galleries, but I'm a lunatic and I really wanted to show the visitor the maximum number of galleries and pages. But if you say it's actually not possible, I'll accept that. Thank you.



19:16::query_time=104.92472696304::query=SELECT SQL_CALC_FOUND_ROWS gs.gallery_id, gs.thumb_id, gi.sponsor_id, gi.content_count, gi.content_type, gs.total_shows, gs.total_clicks, gs.total_ctr, gt.thumb_url, gd.*, gi.content_count, gi.crop_profile_id, gi.activation_date, gi.added_date, gi.duration, gi.url, gi.gallery_total_shows, gi.sponsor_id, gi.source_url, gi.custom_gallery FROM rot_gallery_stats24 as gs FORCE INDEX (activation_date) JOIN rot_gallery_info as gi on gi.gallery_id = gs.gallery_id
JOIN rot_gallery_data1 as gd on gd.gallery_id = gs.gallery_id
JOIN rot_thumbs as gt on gt.thumb_id = gs.thumb_id
WHERE 1 = 1 AND gi.gallery_status = 'active' AND gi.gallery_type = 0 and gs.best_thumb = 'yes' and gs.group_id = 0 ORDER BY gs.activation_date DESC LIMIT 0, 100# queryitems::

Re: sphinx

Posted: Tue Jan 09, 2018 6:50 pm
by admin
So you think Sphinx will somehow count faster than MySQL?

And what's the point of showing the exact amount?

Re: sphinx

Posted: Thu Jan 11, 2018 12:01 am
by diszk
Sorry for the delay. Actually no, I was completely wrong, I'm sorry. The load speed difference between search and group pages is only because of the different SQL calc query; the search one is much faster. Anyway, there is no point in showing the number of galleries; I know that most visitors don't visit subpages, as those are mostly satellite sites to the big tubes. The only reason I wanted to do it is that I saw a site, which I think is based on SCJ, showing the number of galleries everywhere. But that could be done with a small script that gets the number from the database once a day and stores it in Redis or a JSON text file etc.; then I just include it in the template. Thank you anyway!
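
A minimal sketch of such a once-a-day script, under stated assumptions: it uses Python's built-in sqlite3 as a stand-in for the MySQL connection, and a simplified, hypothetical `galleries` table instead of the real rot_* tables. The function name and the output file name are also made up for illustration.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def cache_gallery_counts(conn, out_path):
    """Count active galleries per group and write the totals to a JSON file."""
    rows = conn.execute(
        "SELECT group_id, COUNT(*) FROM galleries "
        "WHERE gallery_status = 'active' GROUP BY group_id"
    ).fetchall()
    # JSON object keys must be strings, so group ids are stored as text.
    counts = {str(group_id): total for group_id, total in rows}
    Path(out_path).write_text(json.dumps(counts))
    return counts


if __name__ == "__main__":
    # Demo data in an in-memory database; a real run would connect to MySQL.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE galleries "
        "(gallery_id INTEGER, group_id INTEGER, gallery_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO galleries VALUES (?, ?, ?)",
        [(1, 0, "active"), (2, 0, "active"), (3, 10, "active"), (4, 10, "disabled")],
    )
    out_file = Path(tempfile.gettempdir()) / "gallery_counts.json"
    print(cache_gallery_counts(conn, out_file))
```

Run from cron, this keeps the expensive count out of the page request; the template would only read the cached number, which can be up to a day old.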

Do you plan to change to real time indexes?

Re: sphinx

Posted: Thu Jan 11, 2018 6:02 pm
by admin
It depends on how many pages you have; you need this number for each category and so on.
Why don't you just add a random number?

Did you try delta indexes?

Yes, I'm thinking of trying Sphinx 3.0 in a month or so. If we want to start using RT, we should start with the latest version)

Re: sphinx

Posted: Fri Jan 12, 2018 2:33 pm
by diszk
Aww, I didn't even notice that Sphinx 3.0 is live, thanks for the heads-up, I'm so excited :) Also, the delta indexes are used by default. A random number is just too easy :) and I'm sure that whoever did it shows the correct number, but now that I know how to achieve this, it doesn't really matter anymore.

Some more questions if you don't mind:
1) I see that it's possible to automatically translate the entire site with Google Cloud. But I cannot get in with a personal account; it tries to force me to register a business account. Do you have any workaround for this?

2) If I set up, let's say, 10 different languages and set "different stats for languages", does that mean the rot_gallery_stats1...x size would be defaultsize*languages?

3) For the master/slave setup: if I add a new slave and the slave shows "Connecting...", as far as I can see it's running database processes like adding a new rot_gallery_stats21 and so on. What would happen if the server crashed while it's working?

4) Do you have anything in mind about MySQL replication in this SCJ context? I'd like to set up one server for grabbing the galleries and another for processing the websites, so I can avoid slowing down the website server; it also won't generate duplicate key issues when adding new slaves. Should I hotcopy just certain tables, or is that not a good idea here?

I appreciate that you spend your free time answering our questions, many thanks for that!

Re: sphinx

Posted: Fri Jan 12, 2018 2:39 pm
by diszk
I'm sorry, but one more thing:
5) I'm creating my SCJ sites with vagrant+chef-solo so I can automate literally everything. Are there any command-line tools that can automatically bind a slave to a master? Thanks again!