<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Library &#38; Information Services &#187; solr</title>
	<atom:link href="http://sites.middlebury.edu/lis/tag/solr/feed/" rel="self" type="application/rss+xml" />
	<link>http://sites.middlebury.edu/lis</link>
	<description>We Bring Knowledge to You</description>
	<lastBuildDate>Wed, 22 May 2013 12:09:41 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>DrupalCon 2010 Trip Report &#8211; Day 3</title>
		<link>http://sites.middlebury.edu/lis/2010/04/26/drupalcon-2010-day3/</link>
		<comments>http://sites.middlebury.edu/lis/2010/04/26/drupalcon-2010-day3/#comments</comments>
		<pubDate>Mon, 26 Apr 2010 14:07:50 +0000</pubDate>
		<dc:creator>Ian McBride</dc:creator>
				<category><![CDATA[LIS Staff Interest]]></category>
		<category><![CDATA[Areas and Workgroups]]></category>
		<category><![CDATA[Conference Reports]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[Drupal]]></category>
		<category><![CDATA[Enterprise Applications]]></category>
		<category><![CDATA[Nutch]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[Web Application Development]]></category>

		<guid isPermaLink="false">http://sites.middlebury.edu/lis/?p=22892</guid>
		<description><![CDATA[After attending a conference, I usually think, &#8220;Wow, we&#8217;re so far ahead here at Middlebury!&#8221; Not this time! DrupalCon was incredibly helpful in demonstrating all of the ways we can improve our site with better performance, better search, better content, &#8230; <a href="http://sites.middlebury.edu/lis/2010/04/26/drupalcon-2010-day3/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>After attending a conference, I usually think, &#8220;Wow, we&#8217;re so far ahead here at Middlebury!&#8221; Not this time! DrupalCon was incredibly helpful in demonstrating all of the ways we can improve our site with better performance, better search, better content, and better code. I&#8217;m also really excited about the upcoming release of Drupal 7 and both confident we can move our site onto this new version and eager to use all the new features.</p>
<p>Here are the highlights from the last day:<span id="more-22892"></span></p>
<ul>
<li>All of <a href="http://sf2010.drupal.org/conference/sessions">the conference sessions are now available to watch online</a>.</li>
<li>Librarians might be particularly interested in <a href="http://sf2010.drupal.org/conference/sessions/shh-drupal-powered-library-site">Shh! This is a (Drupal-powered) Library Website</a>, though I suggest skipping the first 27 minutes as those presenters cover topics not relevant to our setup.</li>
<li>The tech team at the White House is contributing back to the community the work they&#8217;ve done to make Drupal perform better under high load, and the NY State Senate is working to provide an out-of-the-box Drupal configuration for state governments. Whatever your personal politics, I hope you&#8217;ll agree that it&#8217;s neat to see the government using free resources and then improving them for the rest of the community.</li>
<li>The Node Access module in Drupal 7 includes a feature that will allow us to remove the &#8220;core hack&#8221; from Monster Menus, which will allow it to be an accepted module in the Drupal community and help it get adopted by other Drupal users.</li>
<li>Drupal 7 allows you to build form fields with &#8220;states&#8221;. For instance, a group of fields for taking credit card information could have the state &#8220;expanded&#8221; if a checkbox labeled &#8220;pay online&#8221; is checked. This will help us build easier-to-use interfaces.</li>
<li>We can combine the <a href="http://lucene.apache.org/solr/">Apache Solr</a> project with the <a href="http://lucene.apache.org/nutch">Apache Nutch</a> project to create a local search crawler and indexer, like the Google Search Appliance, but with a lot more room for us to expand and a lot less configuration to provide <a href="http://facetedsearch.davidlesieur.com/faceted_search">faceted search</a>.</li>
<li>The new database abstraction layer in Drupal 7 uses PHP&#8217;s PDO library, which will support MySQL, postgres, SQLite, MSSQL, and Oracle as database back ends. There are also huge improvements in database replication, allowing us to have a true &#8220;hot standby&#8221; server, as well as support for prepared statements, transactions, a query builder, and a lot of other stuff that our Banner programmers take for granted.</li>
<li>A number of Drupal projects, still in their very early stages, are working on support for various NoSQL platforms, including MongoDB. These systems promise to improve performance by removing some of the technical limitations imposed by storing information in &#8220;tables&#8221; in databases. Not quite ready for wide use in production, but expect to see a lot more of this in a few years.</li>
<li>Much of the HTML code printed by core modules in Drupal 7 will use RDF markup. This provides additional information about the elements on the page that is intended for browsers and search engines to use. For example, the text &#8220;Price: $9.99&#8221; won&#8217;t get picked up by a search engine as a price, but &#8220;Price: <code>$&lt;span class="field-item even" property="product:listPrice"&gt;9.99&lt;/span&gt;</code>&#8221; allows the search engine to display that information as a price next to the page listing.</li>
</ul>
<p>Read on for more notes on each of these points.</p>
<p><!--more--></p>
<h3>Node Access in Drupal 7 (Notes by Ian)</h3>
<p><a href="http://sf2010.drupal.org/conference/sessions/node-access-drupal-7">Watch the presentation</a></p>
<p>In the current version of Drupal, changes to the user permissions to access a piece of content can only truly happen through the core of the software, meaning that our custom modules to Drupal are limited in what they can do to control access. With Drupal 7, we get a new hook, hook_node_access(), that allows any module to define CRUD permissions (Create, Read, Update, Delete) on any node, and allows any other module to then modify those permissions. I&#8217;ll back up for a second and explain that when a Drupal page loads, the software core gets first crack at doing whatever it needs to do and then all of the modules get a chance to modify it, using hooks, in an order defined by the site administrator. Any module that implements hook_node_access() will run that function on all nodes and, if it is the last module to implement that function, will have the final say on who can do what with a node.</p>
<p>This is very important for us, because this is exactly the type of thing Monster Menus needs to do. MM introduces &#8220;pages&#8221; to Drupal, which the core of the software doesn&#8217;t know anything about. In MM, all nodes are assigned to a page and then permissions are set on the page. So in Drupal 7, MM will be able to run hook_node_access() to tell the system, &#8220;these nodes being loaded right now belong to this page and here&#8217;s who has each of these permissions on them&#8221;. Right now this is all done with a large amount of heavy lifting that taxes the system. The hope is that, by opening up access to these functions, Drupal 7 will improve site performance and let us do more.</p>
<p>There are some costs to doing this. The query builder in Drupal 7 doesn&#8217;t know that things need to be run through the node access logic, so it is a requirement that you add a tag of &#8220;node_access&#8221; to any query that is run against the node table. Failure to do this is a security violation in Drupal and will get your module flagged by the security team. This is a bit silly: since we need to do this for every query against the node table, why doesn&#8217;t the system just add it for us? The presenter said that a patch to Drupal to provide that would likely get approved, but the design philosophy behind the decision is that it&#8217;s &#8220;not the API&#8217;s responsibility to tell you how to code correctly&#8221;. The new node_access hook also only works on single nodes. You can&#8217;t run a whole list of them through it, so there is a chance of poor performance for pages that need to load a bunch of nodes and need to execute these functions on all of them.</p>
<h3>Instant Dynamic Forms with #states (Notes by Ian)</h3>
<p><a href="http://sf2010.drupal.org/conference/sessions/instant-dynamic-forms-states">Watch the presentation</a></p>
<p>Drupal 7 adds the &#8220;#states&#8221; form API element which allows us to define behavior for a form specific to a current &#8220;state&#8221; of the form. For instance, a field could become required depending on whether a checkbox elsewhere on the form is checked or unchecked. You can <a href="http://d7.drupalexamples.info/form_example/states">try out an example of this behavior</a>.</p>
<p>Naturally, these examples are really trivial, but this is a framework with a lot of power in it. The other nice thing about using #states to build forms in Drupal 7 is that all of the JavaScript is created by the forms engine. This makes forms easier to build and less prone to bugs. I hate having to debug JavaScript, so I&#8217;m happy to hand off that responsibility to the software.</p>
<p>One potential issue is that the form is rebuilt by the server after every action by the user. If you click a checkbox, the server gets a notice, rebuilds the entire form, and sends it back to you. This makes the form more secure, since only the server can decide what elements are part of the form, but it is potentially very resource intensive. Our Page Settings form already takes a while to load the first time. If it has to load multiple times in between user actions, I can see people starting to get frustrated. We&#8217;ll have to wait and see.</p>
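<p>To make the idea concrete, here is a minimal Python sketch of how a state rule like the ones described above could be evaluated. It is not Drupal&#8217;s implementation (Drupal defines these rules in PHP form arrays and generates the JavaScript itself). The names <code>form</code>, <code>states</code> and <code>resolve_states</code> are made up for illustration.</p>

```python
# Hypothetical form definition: each field may carry "states", mapping a
# state name to the conditions (other field, required value) that trigger it.
form = {
    "pay_online": {"type": "checkbox"},
    "card_number": {
        "type": "textfield",
        "states": {
            "visible": {"pay_online": True},
            "required": {"pay_online": True},
        },
    },
}


def resolve_states(form, values):
    """Return {field: {state: bool}} for the current form values."""
    resolved = {}
    for name, field in form.items():
        field_states = {}
        for state, conditions in field.get("states", {}).items():
            # A state is active only when every condition matches.
            field_states[state] = all(
                values.get(other) == wanted
                for other, wanted in conditions.items()
            )
        resolved[name] = field_states
    return resolved


unchecked = resolve_states(form, {"pay_online": False})
checked = resolve_states(form, {"pay_online": True})
```

<p>With the checkbox checked, the card number field comes back as both visible and required. Unchecked, both states are off. The rule lives in the form definition rather than in hand-written JavaScript.</p>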
<h3>Web Crawling and Search with Nutch and Solr (Notes by Ian)</h3>
<p><a href="http://sf2010.drupal.org/conference/sessions/how-build-jobs-aggregation-search-engine-nutch-apache-solr-and-views-3-about">How to build a Jobs Aggregation Search Engine with Nutch, Apache Solr and Views 3 in about an hour</a></p>
<div id="attachment_22899" class="wp-caption aligncenter" style="width: 510px"><a href="http://sites.middlebury.edu/lis/files/2010/04/Solr-Nutch_Search_Architecture.png"><img class="size-full wp-image-22899" title="Solr-Nutch_Search_Architecture" src="http://sites.middlebury.edu/lis/files/2010/04/Solr-Nutch_Search_Architecture.png" alt="Solr-Nutch Architecture (Diagram by Adam)" width="500" /></a><p class="wp-caption-text">Solr-Nutch Architecture (Diagram by Adam)</p></div>
<p>Apache Solr is a search engine and Apache Nutch is a crawler that can be used to populate that search engine. Using these tools together, we can build a local search repository that indexes all the same sites our Google Search Appliance does, but allows us to extend the search experience by adding facets and localized search. Both Solr and Nutch are also available in cloud configurations, meaning that we can offload the processing of these actions if they grow beyond what our local staff and servers can manage.</p>
<p>The biggest advantage of using Solr as a search engine with Drupal is that the two are closely integrated through the <a href="http://drupal.org/project/apachesolr">Apache Solr Drupal module</a>. Rather than use a search crawler that goes through the site, Drupal will periodically send off highly structured data to the Solr search repository, including all of the metadata associated with a node. For instance, we have a node in the site for every staff job description with fields listed for department, position number, and level. A crawler just sees these as plain text, but Drupal sends each field off to Solr as part of the index. So when you search for &#8220;Programmer&#8221; you can filter by level, department, or location. Though it isn&#8217;t mentioned anywhere in the documentation, I learned that these filters are automatically set up for any content type field that is a radio button, checkbox, or drop down menu. For other fields, we can build our own filters.</p>
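<p>The difference between crawled text and structured fields can be sketched in a few lines of Python. This does not talk to Solr at all. The sample job descriptions, field names and helper functions are hypothetical, and the point is only that per-field data is what makes filters and facet counts possible.</p>

```python
from collections import Counter

# Hypothetical job-description nodes, indexed field by field rather than
# as one blob of crawled text.
docs = [
    {"title": "Programmer", "department": "LIS", "level": "3"},
    {"title": "Senior Programmer", "department": "LIS", "level": "4"},
    {"title": "Programmer Analyst", "department": "Finance", "level": "3"},
    {"title": "Librarian", "department": "LIS", "level": "3"},
]


def search(docs, term, filters=None):
    """Match the term against the title, then apply exact-match field filters."""
    filters = filters or {}
    hits = [d for d in docs if term.lower() in d["title"].lower()]
    return [d for d in hits if all(d.get(k) == v for k, v in filters.items())]


def facet_counts(hits, field):
    """Count how many hits fall under each value of a field."""
    return Counter(d[field] for d in hits)


hits = search(docs, "programmer")
by_department = facet_counts(hits, "department")
narrowed = search(docs, "programmer", {"department": "LIS"})
```

<p>Here <code>by_department</code> is the kind of list a faceted search page shows in its sidebar, and <code>narrowed</code> is what you get after clicking one of those entries.</p>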
<p>It&#8217;s true that these services emulate what we already have with our Google Search Appliance. We&#8217;ll be meeting later this week to discuss our search strategy and determine whether it will be beneficial to set up this infrastructure, extend how we use the GSA, both, or neither. I&#8217;m glad I attended this session before trying to make that decision!</p>
<h3>Databases: the Next Generation (Notes by Ian)</h3>
<p><a href="http://sf2010.drupal.org/conference/sessions/databases-next-generation">Watch the presentation</a></p>
<p>The big announcement in this session was that the database abstraction layer in Drupal 7 will use PHP&#8217;s PDO libraries, meaning that Drupal 7 can run with MySQL, postgres, SQLite, MSSQL or Oracle as a database backend. Currently, it can run using MySQL or (with some reservations) postgres. Microsoft also announced at the conference that they have supplied a beta driver for MSSQL to the PHP project and are engaged in improving the driver to access their database, rather than relying on the volunteer community to provide it.</p>
<p>This opens up a lot of features that I bet our Banner programmers would be surprised to learn we didn&#8217;t have in Drupal (since these have been common in the PL/SQL environment forever):</p>
<ul>
<li>Prepared statements</li>
<li>Transactions</li>
<li>Named placeholders for variables in statements</li>
<li>Merge and truncate</li>
<li>Return a result set as any object type</li>
<li>Multi-insert statements</li>
<li>Full master/slave replication support for multiple failover servers</li>
</ul>
<p>The importance of that last point can&#8217;t be overstated. Right now, if our primary database server fails, we can switch over to our backup server. However, MySQL servers can only replicate data in one direction, so we can&#8217;t allow you to modify data on the backup server because it will never be written back to the primary server. On top of that, the database abstraction layer is so basic right now that our code can&#8217;t differentiate between a query that is writing to the database and one that&#8217;s just reading from it. So, we solved this by denying our web server permission to execute write statements on the backup server. When the failover occurs, you can try to write data, but you&#8217;ll get a warning and our error log will start piling up the errors too. This isn&#8217;t ideal, and the added functionality in Drupal 7&#8217;s database abstraction layer solves this problem.</p>
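<p>As a rough illustration of what it means for the abstraction layer to tell reads from writes, here is a small Python sketch. It is not Drupal&#8217;s API. <code>Router</code> and <code>FakeConnection</code> are hypothetical names, and classifying a statement by its first keyword is a simplifying assumption.</p>

```python
class FakeConnection:
    """Stand-in for a database connection; it only records what it was asked."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def execute(self, sql, params=()):
        self.log.append(sql)
        return self.name


class Router:
    """Send writes to the primary and reads to the replica (illustrative only)."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "DROP")

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self.primary_up = True

    def execute(self, sql, params=()):
        is_write = sql.lstrip().upper().startswith(self.WRITE_VERBS)
        if is_write and not self.primary_up:
            # Refuse in the application instead of letting the backup
            # database reject the statement and fill the error log.
            raise RuntimeError("site is read-only during failover")
        if is_write:
            return self.primary.execute(sql, params)
        return self.replica.execute(sql, params)


primary = FakeConnection("primary")
replica = FakeConnection("replica")
router = Router(primary, replica)
read_target = router.execute("SELECT title FROM node")
write_target = router.execute("UPDATE node SET title = 'x'")
```

<p>Once the code knows a statement is a write, a failover can put the site into a clean read-only mode instead of relying on database permission errors.</p>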
<p>This was also one of those sessions where I was impressed by the amount of knowledge in the room. A fairly esoteric question was asked about prepared statements: where are they prepared? In the code? By the database? It turns out that, in PHP, the answer is different for each database driver. The MySQL driver prepares the query before passing it off to the database, so it&#8217;s done in C code on the web server. The Oracle driver passes the query to the database to prepare, then fetches the parameters from the web server. The Oracle preparation is more precise and less error prone since the database is doing it, but introduces latency since another request is needed to the web server to get the parameter information.</p>
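<p>For readers who haven&#8217;t used these features, here is a short sketch of named placeholders, multi-insert and transactions. It uses Python&#8217;s built-in <code>sqlite3</code> module rather than PHP&#8217;s PDO, and the <code>jobs</code> table is a made-up example, so treat it as an analogy for what the new layer offers rather than Drupal code.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (title TEXT, department TEXT, level INTEGER)")

# Multi-insert: one statement, many parameter sets.
rows = [
    {"title": "Programmer", "department": "LIS", "level": 3},
    {"title": "Librarian", "department": "LIS", "level": 3},
    {"title": "Analyst", "department": "Finance", "level": 4},
]
with conn:  # commits on success, rolls back if an exception escapes
    conn.executemany(
        "INSERT INTO jobs VALUES (:title, :department, :level)", rows
    )

# Named placeholders keep user input out of the SQL text itself.
query = "SELECT title FROM jobs WHERE department = :dept ORDER BY title"
lis_titles = [r[0] for r in conn.execute(query, {"dept": "LIS"})]

# A failed transaction leaves the table untouched.
try:
    with conn:
        conn.execute("DELETE FROM jobs")
        raise ValueError("simulated failure")
except ValueError:
    pass
remaining = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
```

<p>The simulated failure happens after the <code>DELETE</code>, but because both sit inside one transaction the rows are still there afterwards.</p>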
<h3>MongoDB &#8211; Humongous Drupal (Notes by Ian)</h3>
<p><a href="http://sf2010.drupal.org/conference/sessions/mongodb-humongous-drupal">Watch the presentation</a></p>
<p><a href="http://www.mongodb.org/">MongoDB</a> is a key-value index database that can be used to improve performance for very, very high volume sites. The claim is that storing information on the filesystem using key-value pairs allows it to be accessed more quickly than storing information in tables, as RDBMSs (MySQL, MSSQL, Oracle) do. Since there is no schema, data can be added easily (just append) and indexed on any key. Since it&#8217;s just files, the database can be replicated easily as well. Here&#8217;s an example of a few rows from a MongoDB file:</p>
<pre>{<span>"name"</span> : <span>"mongo"</span> , <span>"_id"</span> : ObjectId(<span>"497cf60751712cf7758fbdbb"</span>)}
{<span>"x"</span> : 3 , <span>"_id"</span> : ObjectId(<span>"497cf61651712cf7758fbdbc"</span>)}
{<span>"x"</span> : 4 , <span>"j"</span> : 1 , <span>"_id"</span> : ObjectId(<span>"497cf87151712cf7758fbdbd"</span>)}
{<span>"x"</span> : 4 , <span>"j"</span> : 2 , <span>"_id"</span> : ObjectId(<span>"497cf87151712cf7758fbdbe"</span>)}</pre>
<p>The same structure in an RDBMS might use three separate tables, one to store x, one for j, and one for name. Depending on how x and j are related, there might be a fourth table involved. This is important for Drupal because all the fields used by content types are stored in separate tables. For our job descriptions, we have a table to store the title of the description, another to store the department for the description, another to store the level for the description and so on. When the node is printed to the page, all of these tables need to be accessed by joining them together, which can become a resource-intensive task. Under high load, this causes problems, particularly in MySQL.</p>
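<p>The contrast between the two layouts can be shown with a small Python sketch. The table names are hypothetical (they only mimic the one-table-per-field idea), <code>sqlite3</code> stands in for MySQL, and a plain dictionary stands in for a document store, so this illustrates the shape of the work rather than measuring either system.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Relational layout (hypothetical table names): one table per field,
# keyed by node id.
conn.executescript(
    """
    CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE field_department (nid INTEGER, value TEXT);
    CREATE TABLE field_level (nid INTEGER, value INTEGER);
    INSERT INTO node VALUES (1, 'Programmer');
    INSERT INTO field_department VALUES (1, 'LIS');
    INSERT INTO field_level VALUES (1, 3);
    """
)
row = conn.execute(
    """
    SELECT n.title, d.value, l.value
    FROM node n
    JOIN field_department d ON d.nid = n.nid
    JOIN field_level l ON l.nid = n.nid
    WHERE n.nid = :nid
    """,
    {"nid": 1},
).fetchone()
from_tables = {"title": row[0], "department": row[1], "level": row[2]}

# Document layout: the whole node is stored and fetched as one record.
documents = {1: {"title": "Programmer", "department": "LIS", "level": 3}}
from_document = documents[1]
```

<p>Both paths produce the same job description, but the relational one needs a join per field while the document one is a single lookup.</p>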
<p>The work being done on MongoDB for Drupal is still in its very early stages and probably won&#8217;t be ready for widespread use for over a year. Right now they have some of the query functions used in Drupal core implemented, but not all of them and not necessarily through the database abstraction layer, meaning that any module using MongoDB with Drupal needs to be rewritten at this time. For those few sites that have enough visitors to warrant using this, that might be an acceptable trade off, but not for us. &#8220;NoSQL&#8221; systems like MongoDB will be one of the most interesting developments in web software this decade. We&#8217;ll have to see how this develops in parallel with the traditional RDBMS systems.</p>
<h3>Scalable infrastructure for Whitehouse.gov (Notes by Adam)</h3>
<p>Frank Febbraro of Phase2 Technology, part of the team that developed Whitehouse.gov, gave a session titled <a href="http://sf2010.drupal.org/conference/sessions/providing-scalable-infrastructure-whitehousegov">Providing a Scalable Infrastructure for Whitehouse.gov</a>. This session covered the techniques used to ensure that the site could handle the huge visitor load, stay extremely secure against defacement, and remain highly available.</p>
<ul>
<li>Infrastructure build-out took a team of the same size and the same amount of time as the development work.</li>
<li>Tested by turning off servers and services to see how the system reacts.</li>
<li>They found that targeting read and write queries to different database servers with MySQL_Proxy worked great in development, but fell down under their heavy load testing. They ended up having to patch Drupal core (by running PressFlow) to enable retargeting from within Drupal.</li>
<li>They run two complete data centers, a production data-center and a disaster-recovery data-center. The disaster recovery data-center also includes development environments and complete data replication so that they can continue operations with a complete loss of the production data center.</li>
<li>The hosts are all RHEL and provisioned with Puppet (60+ servers). They use SELinux to lock down access to files and executables within the hosts and use AIDE to report on unauthorized file access. Puppet allows each type of server to be spawned exactly the same (ensuring that new servers are in audit compliance just like existing ones).</li>
<li>The Akamai content delivery network is used for three services: Site Accelerator (a reverse proxy for handling page caching), Net Storage for file serving, and Live Streaming. 90% of all traffic hits Akamai and doesn&#8217;t need to go through Drupal. Since they have under a hundred authenticated Admin and Editor users (no public users can log in) they have very low authenticated traffic and don&#8217;t need to scale authenticated access much.</li>
<li>PHP code and user files are all served from a NAS system mounted via NFS.</li>
<li>They run Memcache with the consistent hashing strategy that allows a node to fail and cache to still operate.</li>
<li>The database backend is MySQL Enterprise with InnoDB. They also use a RAM-based filesystem for temp-tables to improve the performance of file-sort operations.</li>
</ul>
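<p>The consistent hashing strategy mentioned for Memcache in the list above can be sketched in Python as follows. This is a generic textbook-style ring, not the code the Whitehouse.gov team runs. The class name, the server names and the number of points per server are all assumptions made for the example.</p>

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring; real clients add tuning this sketch omits."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self.ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        # Each server gets many points on the ring to spread its share evenly.
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self.ring = [point for point in self.ring if point[1] != node]

    def get(self, key):
        # First point clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[index % len(self.ring)][1]


ring = HashRing(["cache1", "cache2", "cache3"])
keys = [f"page:{n}" for n in range(1000)]
before = {k: ring.get(k) for k in keys}
ring.remove("cache2")
after = {k: ring.get(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

<p>When <code>cache2</code> drops out, only the keys that lived on it are reassigned. Everything cached on the other two servers stays where it was, which is why the cache keeps working through a node failure.</p>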
<dl>
<dt>Database Replication:</dt>
<dd>They use both Master-Master replication and Master-Slave replication. The second master is a hot-swap for the primary master and also handles all of the slave replication. This means that the primary master only has to handle some reads, all writes, and replicating to a single &#8216;slave&#8217;.</dd>
<dt>Monitoring</dt>
<dd>MySQL enterprise monitor, Nagios for infrastructure monitoring, Cacti for graphs.</dd>
<dt>Replication Monitoring:</dt>
<dd>Constantly writing to a file with a pool of active slaves, allowing PHP to switch where reads are going. Custom scripts remove slaves when replication fails, rebuild them, then move them back into the pool once they are repaired.</dd>
<dt>Environmental sync:</dt>
<dd>Changes to servers and files automatically sync to Akamai as well as the disaster recovery site.</dd>
<dt>Hardware scaling:</dt>
<dd>Goal is to quickly scale horizontally. Puppet handles the provisioning details, so new web servers and database slaves can be brought online in minutes.</dd>
<dt>Data scaling: </dt>
<dd>They receive 15,000+ Webform submissions every day. These are stored outside of the main Drupal database to make site restores much quicker by not requiring the rebuilding of many GBs of data.</dd>
<dt>Development Process: </dt>
<dd>They create a branch per issue, then a branch per release &#8212; merging in completed issue fixes.</dd>
<dt>Release Process: </dt>
<dd>They run a full-featured staging environment that allows testing of all aspects of changes.</dd>
<dd>For the published data like Whitehouse visitors, data files are imported using the Drush command line tools.</dd>
</dl>
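<p>The replication monitoring approach in the list above (a file holding the pool of active slaves, which the application reads to decide where reads go) might look something like this Python sketch. The file format, the function names and the fallback to the primary are assumptions for illustration. The session did not describe those details.</p>

```python
import json
import os
import random
import tempfile


def write_pool(path, replica_status):
    """Monitor side: publish only the replicas whose replication is healthy."""
    active = sorted(name for name, healthy in replica_status.items() if healthy)
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(active, handle)
    os.replace(tmp_path, path)  # atomic swap so readers never see a partial file
    return active


def pick_read_target(path, primary="primary"):
    """Application side: choose a replica from the pool, else use the primary."""
    try:
        with open(path) as handle:
            pool = json.load(handle)
    except (OSError, ValueError):
        pool = []
    return random.choice(pool) if pool else primary


pool_file = os.path.join(tempfile.mkdtemp(), "replica_pool.json")
write_pool(pool_file, {"replica1": True, "replica2": False})
target = pick_read_target(pool_file)
write_pool(pool_file, {"replica1": False, "replica2": False})
fallback = pick_read_target(pool_file)
```

<p>A broken replica simply disappears from the file, and the application stops sending reads to it on the next request without any code change or restart.</p>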
<h3>RDF in Drupal 7 (Notes by Adam)</h3>
<p>Exclamations of how the &#8220;<a href="http://en.wikipedia.org/wiki/Semantic_Web">semantic web</a>&#8221; is going to revolutionize the internet as we know it have been voiced for a decade. The session <a href="http://sf2010.drupal.org/conference/sessions/story-rdf-drupal7-and-what-it-means-web-large">&#8220;The story of RDF in Drupal 7 and what it means for the Web at large&#8221;</a> described how, in the upcoming Drupal 7, semantic tags will be added to Drupal markup as HTML attributes (<a href="http://en.wikipedia.org/wiki/RDFa">RDFa</a>). These attributes will make it easier for machines to understand the meaning and context of words on the page. One example of this used in the description is of a bit of price text having an RDF attribute <code>&lt;div class="field-item even" property="product:listPrice"&gt;9.99&lt;/div&gt;</code> enabling search engines to display that price in their results rather than just assuming that it is unstructured text. In my view, this change won&#8217;t be earth shattering for quite a while (another few years), but will make some things we currently do now (such as search and data re-purposing) work better without as much server-side programming. In the long run there may be bigger implications.</p>
<p>The second neat thing was a mention of an <a href="http://drupal.org/project/rdfproxy">RDF Proxy Module</a> that will enable Drupal to fetch and display content from remote sites via RDF searching. This is sort of like displaying RSS feeds on steroids. The big difference is that rather than being limited to a feed that has one or more items of similar format, the RDF queries can grab individual or multiple data-elements from a remote page or many remote pages. Even better, these queries can be run against the normal human-consumable web pages rather than requiring a separate RSS or XML feed to be generated on the source site.</p>
<p>This session also showed the use of <a href="http://sites.middlebury.edu/lis/">Sindice Inspector</a>, a tool for viewing, navigating, and graphing the RDF data in a web page. They showed a cool example of a blog post with graphs linking the RDF data on multiple sites, but I wasn&#8217;t able to find a URL with such complex RDF data.</p>
]]></content:encoded>
			<wfw:commentRss>http://sites.middlebury.edu/lis/2010/04/26/drupalcon-2010-day3/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>DrupalCon 2010 Trip Report &#8211; Day 2</title>
		<link>http://sites.middlebury.edu/lis/2010/04/21/drupalcon2010-day2/</link>
		<comments>http://sites.middlebury.edu/lis/2010/04/21/drupalcon2010-day2/#comments</comments>
		<pubDate>Wed, 21 Apr 2010 09:23:57 +0000</pubDate>
		<dc:creator>Adam Franco</dc:creator>
				<category><![CDATA[LIS Staff Interest]]></category>
		<category><![CDATA[Areas and Workgroups]]></category>
		<category><![CDATA[Conference Reports]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[Drupal]]></category>
		<category><![CDATA[Enterprise Applications]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[Web Application Development]]></category>

		<guid isPermaLink="false">http://sites.middlebury.edu/lis/?p=22867</guid>
		<description><![CDATA[Here is an overview and some notes from day 2 of the DrupalCon conference that Ian and I are attending in San Francisco. As Ian mentioned in yesterday&#8217;s report, day 1 of DrupalCon was mostly focused on the future of &#8230; <a href="http://sites.middlebury.edu/lis/2010/04/21/drupalcon2010-day2/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Here is an overview and some notes from day 2 of the DrupalCon conference that Ian and I are attending in San Francisco. As <a href="http://sites.middlebury.edu/lis/2010/04/20/drupalcon-2010-trip-report-day-1/">Ian mentioned in yesterday&#8217;s report</a>, day 1 of DrupalCon was mostly focused on the future of Drupal, specifically on the changes and improvements in the upcoming Drupal 7. Today&#8217;s sessions dealt much more with the current Drupal release, as well as with version-neutral topics.</p>
<p>Read on for more on the following topics:</p>
<ul>
<li>Drupal deployment strategies</li>
<li>The Chaos tools for Drupal module development</li>
<li>Drupal in Education</li>
<li>Searching with Apache Solr</li>
<li>Recent MySQL happenings</li>
</ul>
<p><span id="more-22867"></span></p>
<h3>Drupal deployment strategies</h3>
<p>The session <a href="http://sf2010.drupal.org/conference/sessions/dont-touch-server-toolkit-zero-touch-production-environments">&#8220;Don&#8217;t Touch that Server&#8221;: A toolkit for zero-touch production environments</a> focused on ways of deploying Drupal servers that don&#8217;t require SSHing into each machine and running updates. While this is hugely useful when managing 10-100s of servers, most of the techniques aren&#8217;t worth the effort for our little cluster of 4 webservers.</p>
<p>I did learn about a few neat tools and techniques that will likely be useful to improve the type of work that we do:</p>
<ul>
<li><a href="http://www.splunk.com/">Splunk</a> &#8212; A tool for aggregating, monitoring and viewing server logs collected from many systems. This could make it much easier to see trends in the access and error logs of the three main-site web servers.</li>
<li>Syntax checking with <a href="http://www.icosaedro.it/phplint/">PHPLint</a> &#8212; <a href="http://luhman.org/blog/2010/02/12/cheap-php-lint-checking-git">Pre-commit hooks</a> can be set up in our source-control systems just to make sure that we never push a typo to production.</li>
</ul>
<h3>The Chaos tools for Drupal module development</h3>
<p>The session <a href="http://sf2010.drupal.org/conference/sessions/leveraging-chaos-tool-suite-module-development">Leveraging the Chaos tool suite for module development</a> discussed the variety of abilities included in the CTools module that make it much easier to build a variety of dynamic interfaces in Drupal.</p>
<p><strong><a href="http://zroger.com/node/30">Ajax without Javascript</a></strong><br />
AJAX Responder allows passing commands back and forth between JS and PHP.<br />
Drupal 7&#8217;s AJAX framework is based on this CTools implementation.</p>
<p><strong><a href="http://zroger.com/node/31">Ajax modal windows, the easy way</a></strong><br />
Modal Dialogs &#8211; built on top of the AJAX responder. Allows building modal windows with forms all with just a bit of PHP. Handles form validation and submission.</p>
<p>Object cache &#8211; non-volatile cache useful for &#8216;unsaved states&#8217; during multi-step forms.</p>
<p>Form Wizard &#8211; makes it much easier to create multi-step forms. Conceptually, it&#8217;s a workflow of separate single-page forms. Gives you back, finish, cancel, save buttons and controlling widgets and code.</p>
<p>CSS Tools &#8211; disassemble, reassemble, filter by properties/values, reassemble/render-css, compress CSS.</p>
<p>Dependent Fields &#8211; add two additional properties to form fields and the form will be dynamically changed based on choices.</p>
<p>Drop-down links &#8211; basically a single theme function, theme(&#8216;ctools_dropdown&#8217;, &#8230;), to create drop-down js menus like the contextual options menus in D7 (panels &#8216;cogs&#8217;).</p>
<h3>Drupal in Education</h3>
<p>Before lunch Ian and I both went to a &#8220;Birds of a Feather&#8221; discussion on Drupal usage at Colleges and Universities. I split off with a sub-group to discuss the potential of Drupal as an LMS platform. To kick off efforts in this area, we formed <a href="http://groups.drupal.org/lms-learning-management-system">a new LMS group at groups.drupal.org</a> to discuss what features are needed in Drupal for it to replace Blackboard, Moodle, Sakai, and other LMS systems.</p>
<p>In conversations with Amherst developers over lunch we were reminded that their Ed-Tech group has already built a Gradebook and a Quiz module for Drupal. While these modules are currently tied somewhat to Amherst&#8217;s ERP system (Datatel), with some work they could likely be generalized to work with our Drupal installation as well as those at other schools.</p>
<p>There is another <a href="http://drupal.org/project/quiz">Quiz Module</a> available for Drupal as well.</p>
<h3>Keynote: Tim O&#8217;Reilly</h3>
<p>The keynote today was a talk by Tim O&#8217;Reilly called <a href="http://sf2010.drupal.org/conference/sessions/open-source-cloud-era">Open Source in the Cloud Era</a>. This was a nice talk, but not earth-shattering if you&#8217;ve heard Tim speak at his Web 2.0 conference or elsewhere. Good stuff, familiar theme.</p>
<h3>Searching with Apache Solr</h3>
<p>Ian and I both attended the <a href="http://sf2010.drupal.org/conference/sessions/apache-solr-search-mastery">Apache Solr Search Mastery</a> session. We recently set up a test instance of the Apache Solr search engine and Ian tried to get it operational for doing faceted searching of custom content types on our site. Unfortunately the documentation on how to do this is scattered all over the internet and an operational system wasn&#8217;t created. This session answered all of our questions and should allow us to proceed with setting up a faceted search system as well as other custom search abilities (like section-scoped search) in the future.</p>
<p>Blog posts from the presenters:<br />
<a href="http://acquia.com/blog/advanced-apache-solr-example-ip-based-access">http://acquia.com/blog/advanced-apache-solr-example-ip-based-access</a><br />
<a href="http://evolvingweb.ca/story/apache-solr-mastery-how-add-custom-search-paths-hookmenu">http://evolvingweb.ca/story/apache-solr-mastery-how-add-custom-search-paths-hookmenu</a><br />
<a href="http://acquia.com/blog/understanding-apachesolr-cck-api">http://acquia.com/blog/understanding-apachesolr-cck-api</a></p>
<h4>Notes</h4>
<p><strong>Fixed Fields:</strong><br />
Use the site &amp; hash fields to enable using a single search index for multiple sites.<br />
String type is for exact-matched strings like taxonomy terms rather than partial-matched text.</p>
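<p>As a rough illustration of the shared-index idea, here is a minimal Python sketch that builds a Solr query string restricted to one site by filtering on the hash field. The <code>q</code>, <code>fq</code>, <code>rows</code>, and <code>wt</code> parameters are standard Solr query parameters; the function name and the hash value <code>abc123</code> are made up for this example and are not from the session.</p>

```python
from urllib.parse import urlencode, parse_qs


def build_site_scoped_query(keywords, site_hash, rows=10):
    """Return a query string for a Solr select request limited to one site."""
    params = [
        ("q", keywords),
        # fq is a Solr filter query; "hash" is the fixed field from the notes.
        # The hash value passed in is a hypothetical per-site identifier.
        ("fq", "hash:" + site_hash),
        ("rows", str(rows)),
        ("wt", "json"),
    ]
    return urlencode(params)


query_string = build_site_scoped_query("library hours", "abc123")
# Decode it again to show what Solr would receive.
print(parse_qs(query_string))
```

<p>Each site sharing the index would pass its own hash value, so results from the other sites never appear in its listings.</p>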
<p><strong>Dynamic Fields:</strong><br />
Allow you to avoid customizing the schema for custom content fields. These are set up by having a wildcard field for each data type used in CCK.<br />
<em>CopyFields</em> for strings allow sorting on string fields.<br />
<em>NodeAccess</em> dynamic field allows restricting results based on permissions for most common node-access modules. Not sure if this will work with MM.</p>
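<p>To make the wildcard idea concrete, here is a simplified Python sketch of how a field name could be resolved to a type: explicitly declared fields are checked first, then the wildcard patterns. The prefixes and field names below are illustrative assumptions, not a copy of the real schema, and the first-match rule is a simplification of what Solr actually does.</p>

```python
# Hypothetical fixed fields declared explicitly in the schema.
EXPLICIT_FIELDS = {"nid": "integer", "title": "text"}

# Hypothetical wildcard patterns, one per data type.
DYNAMIC_FIELDS = [
    ("ss_*", "string"),
    ("ts_*", "text"),
    ("is_*", "integer"),
    ("ds_*", "date"),
]


def resolve_field_type(field_name, explicit_fields, dynamic_fields=DYNAMIC_FIELDS):
    """Return the type for a field name, or None if nothing matches."""
    if field_name in explicit_fields:
        return explicit_fields[field_name]
    for pattern, field_type in dynamic_fields:
        # A pattern carries its wildcard at the start or at the end.
        if pattern.endswith("*") and field_name.startswith(pattern[:-1]):
            return field_type
        if pattern.startswith("*") and field_name.endswith(pattern[1:]):
            return field_type
    return None


for name in ["title", "ss_department", "ds_event_start", "unknown_field"]:
    print(name, resolve_field_type(name, EXPLICIT_FIELDS))
```

<p>A new custom field only needs a name with the right prefix to be indexed with the right type, which is why the schema itself does not have to change.</p>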
<p><strong>APIs:</strong><br />
<code>hook_apachesolr_update_index</code>: Allows adding extra data (such as thumbnail image URLs) to the search index.</p>
<p><code>hook_apachesolr_node_exclude</code>: Allows custom logic for excluding nodes from search results.</p>
<p><strong>Custom search paths:</strong><br />
Use <code>hook_menu</code> to build up the nice search paths.<br />
Use <code>hook_menu_alter</code> to change the layout of the search page.</p>
<p>Theme search results with custom theme functions.<br />
<em>Note: Solr doesn&#8217;t do any security filtering of results.</em></p>
<p><strong>Indexing CCK field info</strong><br />
<a href="http://acquia.com/blog/understanding-apachesolr-cck-api">http://acquia.com/blog/understanding-apachesolr-cck-api</a><br />
6.1 branch &#8212; Fields captured by default: strings in select/options fields<br />
6.2 branch &#8212; Adds date fields.</p>
<p><code>hook_apachesolr_cck_fields_alter(&amp;$mappings)</code>: Used to add/change which fields are indexed and how they are indexed.</p>
<h3>Recent MySQL happenings</h3>
<p>The session <a href="http://sf2010.drupal.org/conference/sessions/future-mysql-forks-patches-and-decisions">The Future Of MySQL: Forks, Patches And Decisions</a> was a good overview of the state of the MySQL database world now that pluggable storage engines are becoming more common and Oracle has bought Sun (and by extension MySQL AB), among other developments.</p>
<p>The Oracle InnoDB plugin<br />
<a href="http://www.innodb.com/products/innodb_plugin/">http://www.innodb.com/products/innodb_plugin/</a><br />
- Higher performance version of the InnoDB engine. &#8220;Amazing&#8221;. Upgrade if at all possible. </p>
<p>XtraDB plugin from Percona<br />
<a href="http://www.mysqlperformanceblog.com/2008/12/16/announcing-percona-xtradb-storage-engine-a-drop-in-replacement-for-standard-innodb/">http://www.mysqlperformanceblog.com/2008/12/16/announcing-percona-xtradb&#8230;</a><br />
- A fork of the Oracle InnoDB plugin.<br />
- A big benefit comes from splitting the Buffer Pool Mutex into typed mutexes for each operation, so that non-conflicting operations don&#8217;t block each other.<br />
- Rewrite of RW Locks.<br />
- More configuration for IO Thread Numbers, IO Capacity.</p>
<p>Ourdelta/Open Query &#8211; provides builds of MySQL with patch-sets from various sources (Google, etc).</p>
<p>MySQL 5.1 &#8211; What&#8217;s new?<br />
- Row-level replication rather than SQL-based replication. This removes the need for a lot of strange locks that were required to get SQL-based replication working, and it removes the need for SQL statements to be repeatable during replication. There are still many stability issues, but hopefully they will be fixed soon.<br />
- InnoDB Plugin</p>
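<p>The repeatability point can be shown with a toy Python simulation. This is only a sketch of the idea, not of how MySQL implements replication: the <code>Server</code> class and its methods are invented for the example, and the update is a stand-in for any statement whose result depends on server-local state.</p>

```python
import copy
import itertools


class Server:
    """Toy database server holding one table as a dict of row id to value."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = copy.deepcopy(rows)
        self._local_counter = itertools.count()  # server-local state

    def run_nonrepeatable_update(self):
        """Stand-in for a statement whose result depends on local state."""
        changed = {}
        for row_id in self.rows:
            value = self.name + "-" + str(next(self._local_counter))
            self.rows[row_id] = value
            changed[row_id] = value
        return changed  # the row images produced by this statement

    def apply_row_images(self, changed):
        """Row-level replication: copy the resulting rows verbatim."""
        self.rows.update(changed)


initial = {1: "a", 2: "b"}
master = Server("master", initial)
statement_replica = Server("replica", initial)
row_replica = Server("replica", initial)

changed = master.run_nonrepeatable_update()
statement_replica.run_nonrepeatable_update()  # re-runs the statement locally
row_replica.apply_row_images(changed)         # receives the changed rows

print("statement-based in sync:", statement_replica.rows == master.rows)
print("row-based in sync:", row_replica.rows == master.rows)
```

<p>Re-running the statement gives the replica different values, while shipping the changed rows keeps it identical to the master regardless of what the statement did.</p>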
<p>MariaDB/MontyProgram<br />
- Pool of threads like Apache rather than forking.</p>
<h4>Upcoming stuff:</h4>
<p>MySQL 5.5 &#8211; Semi-synchronous replication. Allows you to know that the data has been replicated to at least one slave.</p>
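<p>The guarantee described here can be sketched in a few lines of Python. This is a conceptual model only: the <code>Replica</code> class and <code>semisync_commit</code> function are hypothetical names for this example, and the sketch ignores timeouts and whatever the server does when no acknowledgement arrives.</p>

```python
class Replica:
    """Hypothetical replica that may or may not be reachable."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.relay_log = []

    def receive(self, transaction):
        """Store the transaction and acknowledge it, if reachable."""
        if not self.reachable:
            return False
        self.relay_log.append(transaction)
        return True


def semisync_commit(transaction, replicas, required_acks=1):
    """Return True once enough replicas have acknowledged the transaction."""
    acks = 0
    for replica in replicas:
        if replica.receive(transaction):
            acks += 1
        if acks == required_acks:
            return True  # at least one copy exists off the master
    return False  # no replica confirmed; the caller must decide what to do


replicas = [Replica("r1", reachable=False), Replica("r2")]
confirmed = semisync_commit("INSERT row 42", replicas)
print(confirmed, [r.name for r in replicas if r.relay_log])
```

<p>The commit is only reported as confirmed after one replica has stored the transaction, which is the difference from fully asynchronous replication, where the master never waits.</p>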
<p>Upcoming in MariaDB 5.2: varchar/blob for heap to prevent temporary tables from going to disk.</p>
<h4>Other Notes:</h4>
<p>Do not put a UNION inside a view! &#8211; Performance nightmare.</p>
]]></content:encoded>
			<wfw:commentRss>http://sites.middlebury.edu/lis/2010/04/21/drupalcon2010-day2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
