Hello Rainman, since I do not know how to draw flags, I'm asking whether you could draw the '91-'95 Macedonian flag where the sun rays are all the same length. I have found only a small version. Could you draw the flag according to the style of the other flags on Wikipedia? Thank you. Korpas (talk) 15:38, 6 January 2009 (UTC)
modification to search several Wikipedian sections at one time
rainman, I just got your message, and wanted to thank you! I also had some questions. Is the software updated on a regular basis? And is the search open software, or can it at least be viewed? Thanks. --stmrlbs|talk 01:30, 16 June 2009 (UTC)
This is fantastic, and so quick! Thanks for the other information about the search, and I will be sure to update the documentation after I try it. I've written enough software documentation to choke a horse, so I will be glad to do that. --stmrlbs|talk 17:36, 16 June 2009 (UTC)
rainman, what are the differences between the way Wikipedia searches and the way Google does? I figured that since you've worked in this area, you would be more aware of the differences. People think they are going to find all instances of a word on the net with a Google search, but that is not always the case. You still sometimes have to do a separate search to find all instances of something on Wikipedia, whether it is a Google site search or a Wikipedia search.
examples:
google search for horticulture - 1 result (found on wikipedia)
Yes, well, I didn't do any systematic research into this, although it would be interesting. One thing I noticed is that sometimes the page is just absent from the Google index, and while certain words act as keywords for an article, others just don't. For instance, for most of 2008 Google was blind to this article: simple:Douglas Adams. Whatever you did, whether you edited the page or tried different searches, it was just not in the index. And you already found a case with external links where Google is blind to that part of the text, but this can happen to other parts of the text as well. There is either a very sophisticated algorithm that deems these words useless, or there is some kind of temporary failure in Google's internal architecture that makes the results not show up. --rainman (talk) 09:41, 1 July 2009 (UTC)
hmmm.. that's interesting. I wonder what other pages are left out. It would be interesting to run a google search of a common term against a wikipedia search of that same term, save the results, and run a compare utility to see the differences.
I have found that google does not necessarily index every page on a site, even though they "index" the site. A friend of mine had an open forum. He wanted to use the "site google search" facility instead of the search that came with the forum software for performance reasons. But, google would only index about 10% of his site. I could not figure out why.. the forum changed regularly (with new posts), and the subjects were interesting. I knew other forums with the same software that were immediately indexed. He didn't have ads or link farms, or anything like that. But, google just didn't want to index his forum for some reason. The only posts that got indexed were those posts linked to from somewhere else.
rainman, I've always used a couple of different searches if I really need to find something that is a bit obscure. There used to be more differences between the different search engines.. they are getting more homogeneous as time goes on. But there are still differences. I think a lot of people think google finds everything, but it does more decision making as far as relevance, and depending on what you are looking for, it can totally miss it, if it is a webpage, or part of a web page, deemed "irrelevant". I noticed how google was blind to external links when I was trying to find all articles which linked to a defunct website. It gave me back no results when I knew there were many articles that had this link. I like google, and think it is one of the most user friendly search engines - but I know that sometimes I can't depend on it to find everything. --stmrlbs|talk 20:16, 1 July 2009 (UTC)
Stmrlbs, I just noticed the Google searches you propose above aren't Google searches of Wikipedia alone. Using a Google search of only Wikipedia, you'll get 4,350 hits for "horticulture". That's why I love Google Toolbar, which has a button for searching websites. One time I was searching for a name and only a few sites turned up. I then searched using the website search button. Several times I've experienced that Google can thus access information on locked websites, IOW it bypasses the login security feature. This time it found the name, but very deep in the website where pot growers in British Columbia discussed their growing methods, how to smuggle it across the border into the USA, etc. Everyone was on a first name basis. Obviously the name I was searching for was someone I didn't know. I then got curious and "backed out" of the website and found no other content. Even the index page was blank. This was apparently their "secret" meeting place, but Google found it. Anyway, I hope that information about how to use Google to deep search one website is helpful. -- Brangifer (talk) 14:21, 1 July 2009 (UTC)
that's not good. Google ignoring security requirements. It sounds like the site didn't have the login security properly set up (I hope that was the reason - rather than just bypassing security). The problem with Google indexing areas like this is that as thrilled as you were that google bypassed security, this kind of thing works both ways. Do you want some person googling and finding your bank history online, or something like that - because google is ignoring security? But, this doesn't sound like something google would do intentionally. I hope not.
As for my example, the reason given in the RFC for not using wikipedia search is that you can find all the results on multiple websites with one fell swoop of google. I showed with these examples that this is not the case: if you want google to find all the instances of a word on a website, you will still have to do a site search of that site. So, that is multiple searches, not one search.
as for the 4,350 results returned for a toolbar site search of wikipedia, my google site search - using the advanced options of the regular google search - for horticulture in wikipedia returns 6,480 results!! The Wikipedia search for horticulture returns 6,015 results. So, there are differences between the regular google site search, the Wikipedia search, and your googlebar site search, with your google bar search turning up the fewest results.
So, which is the best search? Imo, even though I am definitely for anything that improves user friendliness, ease of use is not the only factor in what makes a search a good search. With the algorithms being for the most part hidden in the popular search engines, sometimes it is hard to figure out what results are being dropped and for what reasons. Rainman has worked internally on the wikipedia search, and therefore, I thought he would be the best person to ask about those differences. If possible, I would like to document some of the differences so that people can make better decisions as to which search to use - depending on what they are looking for. Regardless of what comes out of the RFC.
Unfortunately I only have anecdotal evidence, although it would be interesting to do more in-depth analysis, e.g. by using all wikipedia and all google hits for keywords and seeing in which articles they differ. Comparing numbers of hits can be tricky since wikipedia search gives exact numbers, while google gives an approximate number, so they are not really comparable for large numbers of hits. --rainman (talk) 00:49, 2 July 2009 (UTC)
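The per-article comparison suggested here could be sketched roughly as follows. This is only an illustration: the function name `compare_hits` and both lists of titles are made-up placeholders, and how the hits would actually be collected from each search engine is left out entirely.

```python
def compare_hits(wikipedia_titles, google_titles):
    """Group article titles by which of the two searches returned them."""
    wiki = set(wikipedia_titles)
    goog = set(google_titles)
    return {
        "only_wikipedia": sorted(wiki - goog),
        "only_google": sorted(goog - wiki),
        "both": sorted(wiki & goog),
    }


# Hypothetical sample data, not real search results.
wikipedia_hits = ["Horticulture", "Gardening", "Pomology"]
google_hits = ["Horticulture", "Gardening", "Floriculture"]

diff = compare_hits(wikipedia_hits, google_hits)
print(diff["only_wikipedia"])
print(diff["only_google"])
```

Comparing sets of titles rather than hit counts sidesteps the problem that one count is exact and the other approximate.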
yes, just comparing straight numbers wouldn't be enough. But if there was a big difference in the numbers, then I would think it would indicate something. Probably the fact that google doesn't check the external links accounts for a lot, but if there was a way for the wikipedia search to turn off external link checking (or checking of any section with a certain name), then it would even it out a bit, and allow checking for other big discrepancies. But, I realize this isn't exactly high priority on anyone's list.. but I'm curious what the differences are now. --stmrlbs|talk 01:32, 2 July 2009 (UTC)
just want to let you know that something is wrong with the new search
I like the changes. Are they documented anywhere? Are there any changes that are not apparent on the new search page? --stmrlbs|talk 19:42, 2 July 2009 (UTC)
rainman, does Wikipedia have a Sitemap defined for the major search engines? Is it defined somewhere? I was wondering how the User Pages are defined (as far as priority) currently in this sitemap - if there is one. --stmrlbs|talk 19:45, 2 July 2009 (UTC)
We used to have some sitemaps, but I think no search engines were actually using them. I would imagine any serious search engine would have a special module just for parsing and updating wikipedia, so our unreliable sitemaps would very probably be ignored anyway. --rainman (talk) 11:14, 3 July 2009 (UTC)
No idea, I have a vague recollection of it being removed at some point, prolly should search through fixed bugs in bugzilla.wikimedia.org and in code commits on mediawiki.org. --rainman (talk) 14:42, 3 July 2009 (UTC)
Hi, can you have another look at WP:VPT#Search page (if you're not watching it anyway)? We really need a way (as I think I mentioned before) of adding help links on the search results page. (Normally things like this are done through a MediaWiki: page, so that each project can customize its own interface, but in that thread you say there is no such page in this case.) --Kotniski (talk) 16:58, 30 October 2009 (UTC)
thought: if nothing at all matches the search query, MediaWiki:Search-nonefound is shown in addition to MediaWiki:Searchmenu-new. But the latter message refers to "checking the search results below". Perhaps Search-nonefound should be shown instead of Searchmenu-new, with the "create page" option added there? Rd232 talk 16:10, 1 November 2009 (UTC)
rainman, why is it necessary to enter the exact title of an article - exact including punctuation - in order to find an article in any of the logs? Especially the Deletion log? Why can the search not find a keyword in the title? That would be so much more user friendly. You will see a lot of people asking "what happened to my article" because they can't find it in the deletion log. I couldn't find a title because I searched for "Searching for the Wrong Eyed Jesus" [1] instead of "Searching for the Wrong-Eyed Jesus" [2]. Ack! Thanks for any help you can give. stmrlbs|talk 03:06, 17 November 2009 (UTC)
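To illustrate the kind of punctuation-tolerant keyword match being asked for, here is a minimal sketch. It is not how the MediaWiki logs work; `normalize_title`, `find_in_log` and the sample log entries are hypothetical names and data invented for the example, and the normalization only handles ASCII letters and digits.

```python
import re


def normalize_title(title):
    """Lowercase a title and turn each run of punctuation into one space.

    Simplification: only ASCII letters and digits are kept as word characters.
    """
    return re.sub(r"[^0-9a-z]+", " ", title.lower()).strip()


def find_in_log(query, logged_titles):
    """Return the logged titles that contain every keyword of the query."""
    wanted = normalize_title(query).split()
    matches = []
    for title in logged_titles:
        words = normalize_title(title).split()
        if all(word in words for word in wanted):
            matches.append(title)
    return matches


# Hypothetical log contents, for illustration only.
deletion_log = ["Searching for the Wrong-Eyed Jesus", "Some Other Article"]
print(find_in_log("Searching for the Wrong Eyed Jesus", deletion_log))
```

With this normalization the hyphenated and unhyphenated spellings reduce to the same keywords, so either form of the query finds the logged title.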
Thank you, rainman. I submitted a bugzilla report - hope I did it correctly. The number is 21555. (I am not asking for a commitment on your part - this is just to let you know if you want to take a look). stmrlbs|talk 03:15, 18 November 2009 (UTC)
Thanks for your note at VPT indicating that the indexer had stopped, and would be restarted. I'm thinking that hasn't happened, and I'd like to explain why I think this, so you can let me know if my thinking is off-base.
I first assumed that the indexer needs to make sure every new page title is in the database, so it would crawl through the new page list chronologically. Then I realized that the indexing is full text, so it has to reflect any change to a page, so perhaps it crawls through the recent change list, of which the new pages are a subset. In either case, my starting assumption is that, at any point in time, there is an earlier point in time such that all changes prior to that time are indexed, and all changes after that time are not yet indexed. I opened the New Page file, did a binary search looking for titles recognized, and identified Reynoldston, New York, created at 22:57, 19 May 2010, as in the database, and the next created file Blanket High School, created at 23:46, 19 May 2010, as not in the database. I did that check Monday evening. Tuesday morning, I checked again, saw that Blanket High School was still not indexed, and surmised that the indexer was not doing anything.
You confirmed my guess, and said it would be restarted.
However, this evening, I just checked again and it is still not the case that Blanket High School is indexed. It occurs to me that perhaps with a restart, it doesn't pick up where it left off, so maybe it is indexing away, but in a different section of changes. However, you can imagine that my first guess is that the indexer would start exactly where it left off, so the fact that it still hasn't indexed Blanket High School leads me to wonder if it really is in action.
I hope I'm not being too much of a pest, but I am working closely with a new editor who has created a fine new article, Terrain Gallery, and I'd like to report to the editor when the file is indexed.
Is it possible that the indexer is not yet working, or is it the case that my approach to testing this is flawed? --SPhilbrick T 22:43, 25 May 2010 (UTC)
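The binary search described above can be sketched as follows, under the same starting assumption: pages are listed in creation order, everything before some point is indexed, and everything after it is not. The function name `last_indexed`, the placeholder titles "Page A" and "Page D", and the `indexed` set standing in for "does search recognize this title?" are all hypothetical.

```python
def last_indexed(pages, is_indexed):
    """Binary search for the position of the last indexed page.

    Assumes pages are in creation order and that every page before the
    boundary is indexed while every page after it is not.
    Returns -1 if no page is indexed.
    """
    lo, hi = 0, len(pages)  # the first unindexed page lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_indexed(pages[mid]):
            lo = mid + 1  # first unindexed page is after mid
        else:
            hi = mid      # first unindexed page is at or before mid
    return lo - 1


# Hypothetical stand-in for checking each title against the live search.
new_pages = ["Page A", "Reynoldston, New York", "Blanket High School", "Page D"]
indexed = {"Page A", "Reynoldston, New York"}

i = last_indexed(new_pages, lambda title: title in indexed)
print(new_pages[i])
```

If a restarted indexer does not resume where it left off, the assumption of a single boundary no longer holds and this search can give a misleading answer, which is the possibility raised in the comment.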
The Search Index has not been updated since 12 August - i.e. 6 days ago.
I have reported that here but nothing has happened. In April you resolved this, as reported here. Any chance you can look at this again, or tell me who to contact? - Thanks Arjayay (talk) 07:59, 18 August 2011 (UTC)
On 18 October, I explained an increasing number of problems with search results on the Wikipedia:Village pump (technical) page. Briefly:-
The number of matches varies, and matches disappear
Some matches only show the article title, with no detail
Matches move about
False matches
You replied with this diff [3], including: "The problem was with search9 which had a stale version of one of search index slices."
The problem has returned. A search for "refered" [4] alternates between 11 matches, 6 of which I corrected 6 days ago (5 are in URLs etc. so cannot be corrected), and 12 matches, in which I have just corrected the 7 new cases.
I suspect there is another "stale ... search index slice" - whatever that means. Could you look at this again please?
Arjayay (talk) 18:39, 9 November 2011 (UTC)
Search results are not including changes I made on 25 February, whilst refreshing a search generates a different selection of results. Looks like the old "stale version of one of search index slices" again.
Could you look at this please? Thanks - Arjayay (talk) 19:27, 29 February 2012 (UTC)
Sorry, it's me again.
After a couple of days of random search results (different numbers of matches when pressing refresh - usually typical of "stale slices") the search index has stopped updating, causing a backlog for us Wikignomes. I have reported this at Wikipedia:Village pump (technical) but usually get a quicker response by messaging you directly. Thanks - Arjayay (talk) 16:15, 7 September 2012 (UTC)
Hi Arjayay,
Search indexing was stopped for a while in the past 24 hours. I had to turn it off so that I could migrate data from one of our data centers to the other. Sorry for the inconvenience! It should be back up and fully up to date now. If the problem you were experiencing is still present, please let me know! Peteryoungmeister - Preceding undated comment added 16:33, 7 September 2012 (UTC)
Thanks for the message - the index does seem "stable" now, i.e. the same number of matches, in the same order, on each refresh. It is not totally up to date - i.e. spelling mistakes corrected 24 hours ago still appear in lists of misspellings - hopefully this will correct overnight? (I'm in the UK = UTC+1) - Arjayay (talk) 16:49, 7 September 2012 (UTC)
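The "stable" check described here (same matches, in the same order, on each refresh) could be written down as a small sketch. The names `is_stable` and `flickering` and the sample titles are hypothetical; collecting the result lists from repeated refreshes of a real search is assumed to happen elsewhere.

```python
def is_stable(refreshes):
    """True if every refresh returned the same titles in the same order."""
    if not refreshes:
        return True
    first = refreshes[0]
    return all(result == first for result in refreshes[1:])


def flickering(refreshes):
    """Titles that appear in some refreshes but not in all of them."""
    if not refreshes:
        return []
    sets = [set(result) for result in refreshes]
    return sorted(set.union(*sets) - set.intersection(*sets))


# Hypothetical titles standing in for three refreshes of one search.
refreshes = [
    ["Article A", "Article B", "Article C"],
    ["Article A", "Article C"],
    ["Article A", "Article B", "Article C"],
]
print(is_stable(refreshes))
print(flickering(refreshes))
```

Listing the titles that come and go between refreshes gives something concrete to report when the number of matches alternates.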
Hi. Thank you for your recent edits. Wikipedia appreciates your help. We noticed though that when you edited Förster resonance energy transfer, you added a link pointing to the disambiguation page Dipole moment (check to confirm | fix with Dab solver). Such links are almost always unintended, since a disambiguation page is merely a list of "Did you mean..." article titles. Read the FAQ • Join us at the DPL WikiProject.