
Apr 30
Published: April 30, 2012 22:04 PM by  Timothy Calunod

Up until this point, we have been working to recreate a solution that was partially out of the box, with limited additional configuration needed, so that we could reach the point of creating a specific solution for our Enterprise Search experience. And while it is true that the setup and configuration of Enterprise Search still needs to be performed in SharePoint 2010, setting up Enterprise Search through a Search Service Application (SSA) is not too far from the SSP setup that would have included the Search service in the first place. The only additional work we needed to do, which was also identical to SharePoint 2007, was the creation and configuration of the Content Sources as detailed in my previous post. Now that we have arrived at the same starting point as the Collaboration Portal from SharePoint 2007, we can examine the additional customization needed to create our targeted Search solution.

Scoping Searches

As we have seen with SharePoint Foundation Search, a search is limited to the Site Collection in which it is performed, allowing a search query to run against any content within that Site Collection from the Site where the search is performed and below. In some ways, Enterprise Search in the previous version worked like this out of the box, since Context Search was added at that time. However, Context Search does not go beyond the Site Collection and thus cannot be considered Enterprise, since it cannot touch other Site Collections, much less content not included in a SharePoint Site.

To achieve a more global search set, an indexing service and associated content sources are required so that SharePoint can communicate with any content and thus return truly Enterprise Search results. By default, then, a search query returns results from any crawled content, and those results are displayed through several possible interfaces, including the Search Center Sites. In this regard, a Site Collection must be configured to determine which Search Center Site will submit the queries and display the results. Since this is not configured in most Site Templates in SharePoint 2010, it must be set, or the result sets will never return content outside of the Site Collection. This sums up what we have achieved thus far in our solution.

Unlike Context Search, however, Enterprise Search is not limited to specific content, so long as the content is crawled by SharePoint. This means that relevance alone may not help much if we are only looking for content in a specific repository, such as a specific Site or File Share. To narrow the search in this way, we need to create a Search Scope. In essence, a Search Scope is a logical division of a Search Content Database configured to run queries against less than the entire corpus of searchable content.

Site Collection Search Scopes

There are, in effect, two configurable levels of Search Scope: Global and Local. Global refers to all Sites in a Farm (or connected Farm), while Local refers to a specific Site Collection. Although Search Scopes can be configured at either level – one through Central Administration or one through Site Collection Administration respectively – both scopes are managed by the SSA as far as publishing, rules, updating and availability. Global Search Scopes are useful when a particular logical division, such as by a specific type of content, is required or applicable across all Site Collections in the Farm, but the more granular nature of a Local Search Scope allows the Site Collection Administrator control over how that group of Sites will use a logical limitation that may only apply to that group of Sites. If this were a business unit, a branch location or even an internal only type of function, the ability to choose availability of Search Scopes can be beneficial.

TargetedSearch-Scopes-CentralAdminView

One of the main reasons why the Search Scope is valuable, however, is inherent in its purpose, which is to limit what will be searched within the available content. Search Scopes allow this limitation through several configuration options, all aimed at tightening the view of the content that will be searched. The most common is by Content Source, but a scope can also be configured at the URL level, and thus can be as broad as an entire host or domain or as narrow as a specific site or folder. This configuration, called a Scope Rule, allows a specific, almost topic- or area-related scope to be applied to search queries. In our example, by configuring a Search Scope with a Scope Rule that only searches the specific Document Center where the promotional data is located, we can pinpoint the content the search will run against without having to touch other content sources.
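To make the URL-level options more concrete, here is a minimal Python sketch of how a web address style rule could decide whether a crawled item belongs to a scope. This is only a conceptual model, not SharePoint's implementation or API: the function name, the rule kind labels ("folder", "hostname", "domain") and the example URLs are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def matches_web_address(item_url: str, rule_kind: str, rule_value: str) -> bool:
    """Return True if item_url falls under a web-address style rule.

    rule_kind is "folder", "hostname" or "domain". These labels are
    hypothetical names for this sketch, not SharePoint identifiers.
    """
    host = urlsplit(item_url.lower()).hostname or ""
    value = rule_value.lower()

    if rule_kind == "hostname":
        # Only items served by exactly this host.
        return host == value
    if rule_kind == "domain":
        # The domain itself or any subdomain of it.
        return host == value or host.endswith("." + value)
    if rule_kind == "folder":
        # The folder URL itself or anything stored beneath it.
        base = value.rstrip("/")
        url = item_url.lower().rstrip("/")
        return url == base or url.startswith(base + "/")
    raise ValueError(f"unknown rule kind: {rule_kind}")
```

With a hypothetical Document Center at http://portal/docs, a folder rule would accept http://portal/docs/Promo/flyer.docx but reject http://portal/news/a.aspx, which is the narrowing effect we are after.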

The key difference between a standard Contextual Search and a Search Scope is that the Search Scope allows the granularity of focusing on a specific data source rather than on the Site Collection as a whole. This also produces a targeted effect anywhere a search needs to be run, regardless of the Site in the Site Collection. A Site Collection Search Scope suits our solution, however, because it limits availability to our targeted Portal rather than making the scope globally available. If the solution called for a specific targeting type not limited to a group of Sites, using a global Search Scope would be appropriate, but as this Site Collection was used specifically for the purpose of housing and accessing this specific data, the Site Collection level works best.

Configuration

We need to complete several steps to configure our Search Scope for our Organizational Portal to target the Document Center where all the location-specific data will be searched.

1) Create the Search Scope

Through the Site Collection Administration page we create the Search Scope by simply naming the Scope, then configuring a few additional areas.

TargetedSearch-Scopes-New

Display Groups

Aside from a Title, we also need a Display Group and a Target Results Page. The Display Group organizes Search Scopes by a type that shows in the drop-down options depending on where the Search Scope is logically grouped. A Search Scope grouped into a set can be used when configuring how search queries will be run by selecting the Scope for the type of search that is expected to be performed. In most cases, the Search Dropdown will be the most likely group as it is the default, but others can be created for additional targeting limitations, such as certain types of content only within certain branches or divisions. Display Groups can only be configured through the Site Collection Administration page governing the Search Scopes, ensuring that these groupings apply within a Site Collection.

As a Display Group is created, or a new Search Scope is created if the Display Group already exists, the Search Scope can be paired with the proper Display Group by either means. A Display Group can further define which Search Scope will act as the default scope, and every Search Scope within a Display Group can be given a position, indicating its importance, relevance or focus based on how far from the top the Search Scope appears. A Search Scope can be paired with more than one Display Group, if applicable. At times, for reasons such as types of content or location of content, Search Scopes and Display Groups could have a many-to-many relationship as well as a one-to-many relationship, although these types of configurations need to be logically organized to make sense to the user, or they will lose their usefulness and perhaps even confuse the user further.
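The relationship between Search Scopes and Display Groups can be pictured with a small data model. The Python sketch below is only an illustration of the ideas in this section (ordered positions, a default scope, and a scope appearing in more than one group); the class and function names are made up for this example, and the "Promotions" scope and "Targeted Search" group are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DisplayGroup:
    """A named, ordered list of scope names with an optional default scope."""
    name: str
    scopes: List[str] = field(default_factory=list)  # first entry shows at the top
    default: Optional[str] = None

    def add(self, scope: str, position: Optional[int] = None) -> None:
        """Place a scope at a 1-based position, or at the end if omitted."""
        if scope in self.scopes:
            self.scopes.remove(scope)
        index = len(self.scopes) if position is None else max(position - 1, 0)
        self.scopes.insert(index, scope)


def groups_for(scope: str, groups: List[DisplayGroup]) -> List[str]:
    """List the names of the Display Groups that a scope is paired with."""
    return [g.name for g in groups if scope in g.scopes]


# Hypothetical setup: one shared drop-down and one dedicated group.
dropdown = DisplayGroup("Search Dropdown")
dropdown.add("All Sites")
dropdown.add("People")
dropdown.add("Promotions", position=1)  # move the targeted scope to the top

targeted = DisplayGroup("Targeted Search")
targeted.add("Promotions")
targeted.default = "Promotions"
```

Here the same scope belongs to two groups, which is the one-to-many case described above; keeping a dedicated group with a single scope is the simpler arrangement we use in this solution.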

We will discuss the Target Results Page at a later time.

TargetedSearch-Scopes-DisplayGroup

2) Define the Display Group

By either method, creating a new Display Group or creating a new Search Scope if the Display Group is available, we can choose which drop-down will be tied to the Search Scope. Since we want to perform standard Enterprise Searches but also use a targeted search, we will create a Display Group that indicates where the Search Scope will be used and how that search will run. By creating a one-to-one relationship between Display Group and Search Scope, we can eliminate confusion over how the Search Scope will work. Note that at least one Display Group must be chosen for the Search Scope to surface and be usable.

Search Scope Rules

3) Create the Search Scope Rule

Once the Search Scope is created, one or more Rules need to be created for the Search Scope to be properly limited. The Search Scope will not be available until Rules are added, although only one Rule is necessary.

TargetedSearch-Scopes-AddRules

Each type of Rule focuses on how the content will be divided, such as by Web Address or by Property. However, from within a Site Collection, the Rule type that focuses on Content Source cannot be chosen, although it can be configured through Central Administration. In our scenario, the Web Address Rule will apply, as we want our Search Scope to be limited to only the Document Center within our Portal. Since the Web Address options are very specific, we need the Folder level to indicate our Document Center Subsite in our Portal Site Collection. The choice of Include, Require or Exclude is also important, as each behavior acts as an OR, AND or AND NOT operator respectively. This allows multiple Rules to be properly joined or excluded as needed. Since we do not need to configure additional Rules, the default of Include will suffice.
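To show how the three behaviors combine, the following Python sketch treats Include as OR, Require as AND and Exclude as AND NOT when deciding whether an item falls inside a scope. It is a conceptual model under stated assumptions rather than SharePoint's actual evaluation logic: the function names and URLs are hypothetical, and treating a scope with no Include or Require rule as empty is an assumption of this sketch.

```python
from typing import Callable, Iterable, Tuple

# A rule is a (behavior, predicate) pair; the predicate tests an item URL.
Rule = Tuple[str, Callable[[str], bool]]


def under(prefix: str) -> Callable[[str], bool]:
    """Build a predicate that matches URLs beginning with the given prefix."""
    return lambda url: url.lower().startswith(prefix.lower())


def in_scope(url: str, rules: Iterable[Rule]) -> bool:
    """Combine rules: Include = OR, Require = AND, Exclude = AND NOT."""
    rules = list(rules)
    includes = [p for behavior, p in rules if behavior == "include"]
    requires = [p for behavior, p in rules if behavior == "require"]
    excludes = [p for behavior, p in rules if behavior == "exclude"]

    if not includes and not requires:
        return False  # assumption: no positive rule means an empty scope
    if any(p(url) for p in excludes):
        return False  # AND NOT: any Exclude match removes the item
    if not all(p(url) for p in requires):
        return False  # AND: every Require rule must match
    # OR: at least one Include rule must match, if any are defined.
    return not includes or any(p(url) for p in includes)


# Hypothetical rule set: a single Include rule for the Document Center.
doc_center_rules = [("include", under("http://portal/docs/"))]
```

For our solution the single Include rule is enough: in_scope("http://portal/docs/flyer.docx", doc_center_rules) is true, while an item under http://portal/news/ is left out.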

TargetedSearch-Scopes-ScopeRules

Once a Rule type is chosen it cannot be changed, although it can be deleted and recreated.

4) Publish the Search Scope

Rules can be changed, removed or added at any stage, but each change, including the creation of the Search Scope with at least one Rule, requires that the Search Scope be processed by the SSA upon completion. Scope updates run on a schedule, every 15 minutes by default, so a change can wait up to about 15 minutes, but the update can be started sooner through Central Administration.

TargetedSearch-Scopes-ScopeWaitingUpdate

TargetedSearch-Scopes-Update-CAdmin

Once the Search Scope has been properly configured, it can now be bound to a Search Center through the use of Web Parts, Targeted Pages, and Search Settings.

Observation

Because this grouping and logical limiting of search queries is not automatic behavior, there are several steps involved in configuring a working, usable targeting system that allows for the right level of focus as well as application. By creating a Search Scope with a single Rule to focus only on the Document Center, and then grouping that Search Scope into a single Display Group, all search queries that run through that Display Group will only yield results from our targeted location, effectively meeting the targeting requirement of our solution. And while this particular solution did not require it, some thought and planning should go into the creation of Search Scopes, Search Scope Rules, and Display Groups, to optimize the Enterprise Search experience and not confuse or detract from the Search Application.

One last customization needs our attention, however, which is how the Search Center, the Search Settings, the Content Source, and the Search Scope all come together. This will be the topic of the next blog post.



Mar 31
Published: March 31, 2012 00:03 AM by  Timothy Calunod

In my previous post, we began a configuration process to reproduce a simple targeted Search solution in SharePoint 2010 that was once configured in SharePoint 2007. The differences in the configuration were far more plentiful than I expected, and thus required a little more work than the original setup. However, the process can still bring us to the solution despite the changes that made the recreation more complex.

We began with setting up the simple content architecture with a Web Application and a four-Site Site Collection, and proceeded to create the Search Service Application (SSA) needed to meet our Search Service requirement. At this point, the configuration of the Content Sources, the Scope for the target and the custom Web Pages in the Search Center still need to be completed, all of which we will continue with here in this post.

Content Sources Overview

In order for any content to be discovered through a Search query, a Content Source must be created and configured. This Content Source defines what network endpoint SharePoint will identify as a location of stored content and uses Hosts, also known as Start Addresses, to define what points in that network source SharePoint will crawl. Additionally, file types must be defined for SharePoint to identify, by a simple file extension, what is considered content by both the user and by SharePoint. Furthermore, SharePoint requires a special interpreter called an Index Filter (or iFilter) to open and understand how to read the content in the designated content node so it can create the entries in the Index Partition and other databases. Thus the Content Source is heavily used by the Crawl Component to find and index content for search usage.

Content Sources in SharePoint 2010 are similar to SharePoint 2007, but now a new Content Source communication system has been included. Previously, all Content Sources used Protocol Handlers to communicate with a network endpoint and traverse the system to identify content in that system, referred to as Content Nodes. Now, some Content Sources, such as the ones that connect to third-party content system databases such as Lotus Notes or Documentum, use the newer system of the Connector Framework. This Indexing Connector is built upon Business Data Connectivity Services, and allows SharePoint to crawl, enumerate, and create local indexes of the targeted database content. The Indexing Connector, regardless of being Protocol Handler or Connector Framework, is required for the Crawl Component to use the configured Content Source, but by picking the type of Content Source, the proper Indexing Connector is chosen as well. Thus content can essentially be grouped by its native Indexing Connector, such as File Shares or Exchange Public Folders.

When building a Content Source, once the Indexing Connector type has been chosen (again by choosing the Content Source type), the host addresses that will be crawled must also be determined. These addresses, also known as Start Addresses, include the entry point for the Indexing Connector via the URL, UNC, or other method used by an Indexing Connector (such as a Business Data Connectivity Model) and a crawl depth configuration specific to the Indexing Connector type that allows SharePoint to crawl subfolders, other servers, and the like. And while the software boundary of a Content Source is 500 Start Addresses, the recommended threshold is 100. This means the starting points for the crawls should be planned high enough in the target system to crawl everything expected to be indexed from that system.
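As a planning aid, the idea of picking Start Addresses high enough in the target system can be sketched in a few lines of Python. This is an illustrative helper, not part of SharePoint: the function names are made up, it only handles URL-style addresses, it assumes the crawl depth setting covers everything beneath a kept address, and the two limits are simply the numbers quoted above.

```python
RECOMMENDED_START_ADDRESSES = 100  # recommended threshold quoted in the text
MAX_START_ADDRESSES = 500          # software boundary quoted in the text


def collapse_start_addresses(addresses):
    """Drop addresses that sit beneath another address already in the list.

    Assumes the crawl depth for a kept address reaches everything under it.
    """
    normalized = sorted({a.rstrip("/").lower() for a in addresses})
    kept = []
    for address in normalized:
        # A parent always sorts before its children, so it is already kept.
        if not any(address.startswith(parent + "/") for parent in kept):
            kept.append(address)
    return kept


def check_start_address_count(count):
    """Classify a Start Address count against the two limits."""
    if count > MAX_START_ADDRESSES:
        return "over boundary"
    if count > RECOMMENDED_START_ADDRESSES:
        return "over recommended"
    return "ok"
```

For example, a list containing http://portal, http://portal/docs and http://portal/news/ collapses to the single address http://portal, which keeps the Content Source well under the recommended count.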

Also, each Content Source can have and should have a Crawling Schedule, to allow the Crawl Component to regularly visit the target system to update indexes and to keep the source of content queries as fresh as the organization requires. The key tradeoffs lie between the amount of content expected to be crawled and how much processing load the Crawl Component and the target system can bear while still delivering acceptable daily working performance. This planning point usually results in multiple Content Sources, which reduce resource consumption by using differing and staggered schedules when performing crawls. Scheduling options include Full Crawls and Incremental Crawls. Full Crawls are usually required at the onset of the SSA’s deployment, and subsequently when configurations change or corrupted indexes need to be repaired, among other reasons. However, SharePoint will not perform any crawls unless triggered either manually or by schedule.
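The scheduling ideas above can be sketched as a small planning aid in Python. This is not how SharePoint stores or runs its schedules; the function names, the 90-minute gap and the start time are assumptions chosen only to illustrate staggering crawls and choosing between a Full and an Incremental Crawl.

```python
from datetime import datetime, timedelta


def stagger_crawls(sources, first_start, gap_minutes):
    """Give each Content Source its own start time, gap_minutes apart."""
    return {
        name: first_start + timedelta(minutes=gap_minutes * i)
        for i, name in enumerate(sources)
    }


def choose_crawl_type(has_been_crawled, config_changed=False, index_damaged=False):
    """Pick a Full Crawl for the cases named in the text, else Incremental."""
    if not has_been_crawled or config_changed or index_damaged:
        return "full"
    return "incremental"


# Hypothetical plan: start at 1:00 AM and space the crawls 90 minutes apart.
schedule = stagger_crawls(
    ["Local SharePoint Sites", "Workspace Content"],
    datetime(2012, 3, 31, 1, 0),
    gap_minutes=90,
)
```

Spacing the crawls this way keeps the Crawl Component from hitting every target system at the same moment, at the cost of some content being indexed a little later.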

TargetedSearch-ContentSource-View

Configuration

In Enterprise Search, SharePoint has always been aware of the content it manages when it comes to indexing. A standard Content Source called Local SharePoint Sites is created automatically when a new SSA is built, and it includes the Start Addresses for each Web Application associated with the SSA. This automatic configuration makes crawling SharePoint-native content much easier, and newly added Web Applications are automatically added as host addresses as needed. For our scenario, since the SSA did not exist before creating the Web Application, we do not need to set up this host address. However, to ensure that our targeted application is focused on the particular content in question, a different location of duplicate content will be created using the File Shares Content Source.

1) Create a Share

First, a file share named Workspace Content will be created that will host the same content as our Web Application. Again, this is only to verify that our targeted solution is actually looking at the source we are expecting, which will be the SharePoint Site-based content.

TargetedSearch-Content-FileShareView

2) Create a File Share Content Source

Next, we build a Content Source based on the File Shares Indexing Connector and set the appropriate Start Address by using the UNC of the file share.

TargetedSearch-ContentSource-AltSource

3) Run a Full Crawl of the Content Source to be sure the File Share content is crawled

TargetedSearch-ContentSource-FullCrawlAltSource

4) Add content to the Document Center Subsite of the Web Application

TargetedSearch-Content-DocCenterView

The Document Center Site Template now includes an Upload a Document button, which triggers the single document upload dialog box and also allows for multiple content uploads. The same content stored in the Workspace Content File Share is also stored here. Also, the Document Center has only one Document Library by default called Documents, which will suffice for our scenario, but could include additional Document Libraries, Folders, and even Picture Libraries as it did in the original solution. In this case, we do not need to flesh out that level of organization to test the solution.

5) Create web page-only content for the Web Application

TargetedSearch-Content-PublishingSiteView

To ensure that the targeted solution is also only looking at the Document Center Site, an additional Site was created to emulate the News Site from the previous version of SharePoint. You may notice that the Workspace Content File Share also includes HTML pages that would be used in web pages for the News Site, and thus after the Publishing Site was created as a Subsite, web pages for the Site content were also generated here. The purpose of this is to test how keywords in Search may show results from the Document Center, the Publishing Site, or the File Share, since the point of the targeted solution is to focus on only the Document Center Site.

TargetedSearch-PublishingSite-WebPageView

6) Perform a Full Crawl for the Local SharePoint Sites Content

Since we need the content of both sites, the Document Center and the Publishing Site, included in our results set as this would be a standard expectation of an Enterprise Search, a Full Crawl against the Web Application needs to be run. Going forward, Incremental Crawls will be adequate, but for testing purposes we will update our index to be as fresh as possible.

Viewing Search

Once our configuration for Site content and crawled content has been completed, we can test that Enterprise Search is providing an accurate view of our content. This is important because SharePoint 2010 continues to support Context Search (sometimes referred to as SharePoint Foundation Search), which automatically indexes a Site Collection for its content and allows a user to search only within that Site Collection. By adding the File Share Content Source and creating the Enterprise Search Site, we have created an Enterprise Search that will go beyond the Site Collection and will behave as a standard Enterprise Search application should.

TargetedSearch-SearchResults-Comparison1

TargetedSearch-SearchResults-Comparison2

In the first screenshot, we see that content has come from both the Document Center Site and the File Share. We are also alerted to the presence of duplicate entries, as denoted by the View Duplicates link that shows under some of the results. In the second screenshot, one of the duplicate entries is expanded to show that content from the Publishing Site is also being displayed in results separately from the File Share. Thus we have an expected and very standard approach to an Enterprise Search application, emulating what would have occurred with less configuration from SharePoint 2007. Now that we have the groundwork set, we can create the targeted application solution.

Observation

Similarities between versions of SharePoint do not always mean quick and easy configuration when going from the previous version to the next. Although many enhancements have been made to improve the experience and functionality, there are still additional tasks and planning points that need to be considered and performed to deliver a functional solution. And while some things, such as Context Search, provide a quick and touchless solution for simple search, expanding to deeper or broader levels requires more work, even for something that may have been considered standard in the previous version.

In our scenario, building Enterprise Search from scratch required many additional steps and planning decisions, and there was still plenty of work to be performed just to reach the point where the custom solution was originally crafted. There are reasons why this approach can be useful, especially when considering a more centralized Enterprise Search experience, but it also means some processes from previous configurations may need to be well documented and thought out a bit more to move into the new format and structure.

At this point our scenario brings us to the customization of the solution, and the present system of Enterprise Search still needs to be taken into account. These topics will be examined in upcoming posts.