SharePoint User Group UK

Share the knowledge!

Problems Crawling a DotNetNuke Content Source

Last post 05-16-2008, 11:15 AM by Ubersnug. 8 replies.
  •  04-05-2008, 3:46 PM 9580

    Problems Crawling a DotNetNuke Content Source

    Hi folks

    I'm having a little trouble getting MOSS 2007 to crawl all the content on an IIS-based web site which has been developed using DotNetNuke.  Basically, a lot of content URLs aren't getting crawled, and for each one the crawl log reports:

    "The object was not found. The item was deleted because it was either not found or the crawler was denied access to it"

    In each case the URL is a separate DotNetNuke page tab, and looks like this:

    http://www.domaintosearch.co.uk/subsite/section/tabid/202/default.aspx

    (the number following the tabid will be different for each tab of course).

    So I was wondering if anyone else has seen this and where the problem might lie.  Could it be something on the content source pages preventing crawling?  Or something specific to DotNetNuke?

    Thanks

    Derek

  •  04-07-2008, 9:40 AM 9590 in reply to 9580

    Re: Problems Crawling a DotNetNuke Content Source

    Have you checked the event logs for further errors relating to this one? And does the content crawler account have rights to the DNN site and data?

    Andrew Carter
    .Net & SharePoint
  •  04-07-2008, 10:06 AM 9592 in reply to 9590

    Re: Problems Crawling a DotNetNuke Content Source

    Hi Andrew

    Nope, nothing else in the event logs.  I can happily crawl other public-facing web sites using this crawler account, but this morning I thought I would see what happens if I try crawling other public-facing DotNetNuke sites.

    So I went to their showcase page at http://www.dotnetnuke.com/default.aspx?tabid=541 and tried crawling some of the others (e.g. WineAustralia and EarSinus) and got exactly the same errors.

    So I'm wondering if this is just specific to SharePoint and DotNetNuke?  Or something else I need to configure?

    Derek

  •  04-07-2008, 11:04 AM 9593 in reply to 9592

    Re: Problems Crawling a DotNetNuke Content Source

    Mmm, try turning up the diagnostic logging to see if this sheds any more light on what's happening:

    'Central Administration' -> 'Operations' -> 'Diagnostic Logging' -> 'Event Throttling', then select 'MS Search Indexing', 'Error', 'Verbose'

    Andrew Carter
    .Net & SharePoint
  •  04-07-2008, 11:48 AM 9597 in reply to 9593

    Re: Problems Crawling a DotNetNuke Content Source

    I just tried that, and did a couple of full crawls, but don't see any 'MS Search Indexing' items in the log file.  I also tried setting the least critical event to 'Warning' and crawling again but still no joy.

    And it's definitely only DotNetNuke sites I'm seeing the issues with.  When I can get access to the DNN server I'm going to see if there's anything there which might be causing the problem.

    Derek

  •  04-10-2008, 9:26 AM 9700 in reply to 9597

    Re: Problems Crawling a DotNetNuke Content Source

    A quick follow up on this.  I finally managed to get access to the DotNetNuke web server and did an experiment.  I created a new web app and put some standard ASP.Net and HTML pages in there then did a crawl from MOSS.  This worked as expected without any problems.

    So the issue I'm having is definitely related to content managed and generated by DotNetNuke.

    If anyone has any other ideas, or has seen the problem before, I would very much like to know.

    Thanks

    Derek

  •  05-16-2008, 9:10 AM 10784 in reply to 9700

    Re: Problems Crawling a DotNetNuke Content Source

    On the off chance you haven't, have you configured a crawl rule for your content source so that it crawls complex URLs?

    Under the search settings -> Manage Crawl Rules -> new crawl rule

    Just type in the host name for the website you are wanting to crawl and postfix it with an '*'. For example:

    http://www.domain.com/*

    Also, under Crawl Configuration, select 'Include all items in this path' and tick 'Crawl complex URLs (URLs that contain a question mark (?))'.

    Then just specify which account you want to use to crawl the URL and click 'OK'.
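
    To illustrate why that tick box might matter here, below is a rough Python sketch of how an 'include' crawl rule with a trailing '*' could decide whether a URL gets crawled. It is only an approximation for discussion, not SharePoint's actual matching logic, and the function name rule_allows and the page.html URL are made up for the example. The assumption is that DotNetNuke pages are, at some point, requested in the default.aspx?tabid=N form (like the showcase link earlier in this thread), which would be skipped unless complex URLs are enabled.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def rule_allows(url: str, rule_path: str, crawl_complex_urls: bool) -> bool:
    """Approximate an 'include' crawl rule (illustrative, not SharePoint code).

    rule_path is a wildcard pattern such as 'http://www.domain.com/*'.
    A URL counts as 'complex' when it carries a query string after a '?'.
    """
    # The wildcard path must match first (compared case-insensitively here).
    if not fnmatchcase(url.lower(), rule_path.lower()):
        return False
    # Complex URLs are only allowed when the option is ticked.
    has_query = bool(urlsplit(url).query)
    return crawl_complex_urls or not has_query


rule = "http://www.domain.com/*"
dnn_url = "http://www.domain.com/default.aspx?tabid=541"

print(rule_allows(dnn_url, rule, crawl_complex_urls=False))
print(rule_allows(dnn_url, rule, crawl_complex_urls=True))
print(rule_allows("http://www.domain.com/page.html", rule, crawl_complex_urls=False))
```

    With the option off, the first call rejects the tabid URL even though the path matches the rule; with it on, the same URL is accepted. Plain pages without a '?' pass either way.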

    You may have done this already, but just on the off chance you hadn't.

  •  05-16-2008, 10:10 AM 10792 in reply to 10784

    Re: Problems Crawling a DotNetNuke Content Source

    Well I did have a Crawl Rule set up but I hadn't checked complex URLs.  I've just checked that and done a full crawl and it seems to have done the trick.

    Many Thanks

  •  05-16-2008, 11:15 AM 10809 in reply to 10792

    Re: Problems Crawling a DotNetNuke Content Source

    No probs, glad to help!
