Crawling Large Libraries in SharePoint 2007
I had been experiencing issues crawling a large document library of more than 60,000 items in a 64-bit (x64) SharePoint 2007 farm after the index became corrupt and I had to reset the crawled content. The only error I could find in the crawl log was “The item may be too large or corrupt.” The crawler stopped at around 33,000 items from this document library. I searched the internet extensively for this problem and found a few blogs describing it, each with a different solution. What fixed my issue was a mix of what I found. After these changes the crawler was able to index all 60,000 items from the library.
Registry changes:
- HKLM\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\DedicatedFilterProcessMemoryQuota -> Change the value to: 256000000 Hex
- HKLM\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\FilterProcessMemoryQuota -> Change the value to: 256000000 Hex
- HKLM\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\FolderHighPriority -> Change the value to: 500 Hex
- HKLM\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\DeleteOnErrorInterval -> Change the value to: 4 Decimal
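If you would rather not type these values by hand, the list can be turned into a .reg fragment to review before importing. The Python below is only a rough sketch under a few assumptions of mine: the values are treated as 32-bit DWORDs, the key path follows the form used in the last bullet, and 256000000 is read as a decimal number, because read as hexadecimal it would not fit into a DWORD. The helper names `to_dword` and `render_reg` are made up for this example. Check the existing value types in regedit and back up the key before applying anything.

```python
# Sketch: render the four Gathering Manager values as a .reg fragment.
# Assumptions (not confirmed by the post): all values are REG_DWORD, and
# the key path is the "12.0\...\Gathering Manager" form from the last bullet.
KEY = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server"
       r"\12.0\Search\Global\Gathering Manager")

# (value name, number as written in the post, base assumed in this sketch)
SETTINGS = [
    # Assumed decimal: as hexadecimal this number exceeds 32 bits.
    ("DedicatedFilterProcessMemoryQuota", "256000000", 10),
    ("FilterProcessMemoryQuota", "256000000", 10),
    ("FolderHighPriority", "500", 16),     # the post says Hex
    ("DeleteOnErrorInterval", "4", 10),    # the post says Decimal
]


def to_dword(text, base):
    """Parse a number in the given base and check that it fits a DWORD."""
    value = int(text, base)
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"{text} (base {base}) does not fit in 32 bits")
    return value


def render_reg(key, settings):
    """Build the text of a .reg file that sets each value as a dword."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, text, base in settings:
        # .reg files store dwords as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{to_dword(text, base):08x}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_reg(KEY, SETTINGS))
```

Running the script prints the fragment to the console, so you can compare it with what regedit shows on the index server before saving it as a .reg file.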
Search time-out settings:
1. Central Administration -> Application Management -> Search section -> Manage search service
2. Manage Search Service page -> “Farm-level search settings”