Plone 4 released – the best open source CMS of 2010?
Posted on September 1, 2010 by Mikko Ohtamaa
Filed Under plone, technology

The long-awaited version 4 of the enterprise-grade Plone CMS has been released.
Vote for the Plone 4 release news on reddit. Yeah, yeah… it is a link advertisement.

How people perceive Plone outside Plone community
Posted on August 21, 2010 by Mikko Ohtamaa
Filed Under cms, joomla, plone, technology, web development

Our company does business with multiple CMS systems, like Joomla, Plone and Drupal. They all have their advantages, they all have their disadvantages. We do not want to make CMS a religion. It’s a tool. You can argue with the client about which tool is right for the job. Joomla is a lightweight solution for non-critical systems; Plone is good with lots of content, editors and workflows flying around. Etc. etc.

I had this curious piece of conversation on the #joomla channel on freenode. Though it is an individual case, I hope it will shed some light on how people perceive Plone outside the Plone community and what Plone should do to fix it. I think it would be beneficial for Plone to finally close mailman for site administration / user support and move to real web forums / Google Groups / whatever would be usable. Also, it is an example of how unprofessionalism is not good for the community.

[20:23:29] x: Biggest problem so far is finding competent ("I will deliver on this schedule) joomla consulting experts. Second biggest problem is security, our site has been hacked 3 times in the past 6 months
[20:25:18] me: have you considered any alternative CMS with better security track record?
[20:25:50] x: moo: we moved from Plone to Joomla. 3 years on Plone with no hacks.
[20:26:04] x: Problem with plone is no forums with email support
[20:26:21] me: you pay for support
[20:26:30] z: did you do basic joomla security guidelines?
[20:26:39] me: also check http://plone.org/support
[20:26:40] Title: Support options for Plone — Plone CMS: Open Source Content Management (at plone.org)
[20:26:51] x: moo: I'm fine with paying for support. We're paying SiteGround $200-$300/month on average when you add the support costs.
[20:26:58] z: ie using a key to access admin, changing default sql prefix
[20:27:01] AngryPerson: who cares about plone
[20:27:10] AngryPerson: its an ancient cms thats clearly past its time
[20:27:20] AngryPerson: its only privately supported with little community support
[20:28:34] AngryPerson: Moo^_^: why are you even in here?
[20:28:40] AngryPerson: you just want to piss on joomla?
[20:28:43] me: we do business on drupal, joomla and plone
[20:28:48] me: different tool for different job
[20:28:51] -*- y shrugs
[20:29:01] AngryPerson: just seems to me like you want to push ppl away from joomla
[20:29:11] x: z: After 6 months I'm still a Joomla noob. I need a consulting services company that will do the security patching, maintenance, service on the site, and host it.
[20:29:25] me: not true
[20:29:36] AngryPerson: Moo^_^: well regardless of what you say, tahts how it seems to me
[20:29:54] z: actually I've had good luck just following a few blog posts
[20:30:10] me: I don't defend myself, as I don't want to engage such a conversation with you
[20:30:22] AngryPerson: thats good, why dont you fuck off too
[20:30:22] <-* jools has kicked y from #joomla (Please watch your language)
[20:30:22] --> y (dgdf@unaffiliated/anti-mttr/x-9384728) has joined #joomla
[20:30:27] z: nothing is impervious, but you drastically reduce your attractiveness to hackers by a few simple steps
[20:30:32] AngryPerson: stop giving ppl your shitty advice

Automatically generating description based on body text
Posted on June 4, 2010 by Mikko Ohtamaa
Filed Under plone, python, technology

Below is a sample script to automatically generate descriptions based on page body text. It is for Plone CMS, but should be applicable to any Python based CMS with some modifications. The idea is that we take the first three sentences and use them as a description.

Use case: People are too lazy to write descriptions (descriptions as in Dublin Core metadata). You can generate some kind of description by taking the first few sentences of the text. This is not perfect, but it is way better than an empty description.

Also, the script comes with good comments which should be helpful for beginner Plone programmers.
Please comment if you have other simple ideas to generate descriptions.
Usage
Since Zope uses RestrictedPython for scripts created through the web, the user of this script cannot breach server security (they cannot make Python calls they have no permission for). This limits what such automation can do, but we do not hit those limits in our use case.

def create_automatic_description(content, text_field_name="text"):
    """ Creates an automatic description from HTML body by taking the first three sentences.

    Takes the body text

    @param content: Any Plone contentish item (they all have description)

    @param text_field_name: Which schema field is used to supply the body text (may vary depending on the content type)
    """

    # Body is Archetype "text" field in schema by default.
    # Accessor can take the desired format as a mimetype parameter.
    # The accessor call below should trigger conversion from text/html -> text/plain automatically using portal_transforms
    field = content.Schema()[text_field_name]

    # Returns a Python method which you can call to get the field's value
    # for a certain content type. This is also security aware
    # and does not breach field-level security provided by Archetypes
    accessor = field.getAccessor(content)

    # body is UTF-8
    body = accessor(mimetype="text/plain")

    # Now let's take the first three sentences or the whole content of body
    sentences = body.split(".")

    if len(sentences) > 3:
        intro = ".".join(sentences[0:3])
        intro += "." # Don't forget closing the last sentence
    else:
        # Body text is shorter than 3 sentences
        intro = body

    content.setDescription(intro)

# context is the reference of the folder where this script is run
for id, item in context.contentItems():
    # Iterate through all content items (this ignores Zope objects like this script itself)

    # Use RestrictedPython safe logging.
    # plone_log() method is permission aware and available on any contentish object
    # so we can safely use it from through-the-web scripts
    context.plone_log("Fixing:" + id)

    # Check that the description has never been saved (None)
    # or it is empty, so we do not override a description someone has
    # set before automatically or manually
    desc = item.Description() # Like all Archetypes accessor methods, returns a UTF-8 encoded string

    if desc is None or desc.strip() == "":
        # We use the HTML of field called "text" to generate the description
        create_automatic_description(item, "text")

# This will be printed in the browser when the script completes successfully
return "OK"
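
The sentence-splitting step of the script above can be tried outside Plone. The following is a minimal sketch in plain Python 3, assuming the body has already been converted to plain text; the function name first_sentences and its count parameter are hypothetical and not part of Plone or the original script.

```python
def first_sentences(body, count=3):
    """Return the first `count` sentences of the plain-text `body`.

    Sentences are split naively on "." in the same way as the script
    above, so abbreviations such as "e.g." also count as sentence ends.
    """
    sentences = body.split(".")
    if len(sentences) > count:
        # Re-join the leading sentences and restore the final full stop
        return ".".join(sentences[:count]).strip() + "."
    # Body has no more than `count` sentences: use it unchanged
    return body.strip()


if __name__ == "__main__":
    text = "Plone is a CMS. It is written in Python. It runs on Zope. It has workflows."
    print(first_sentences(text))
```

Splitting on "." is crude, but it keeps the logic simple enough for a through-the-web script; a smarter splitter would be a separate design decision.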

Integrating and theming WordPress with your CMS site using XDV
Posted on March 28, 2010 by Mikko Ohtamaa
Filed Under Wordpress, apache, cms, css, plone, python, technology, web development, xdv, zope

Introduction

XDV is an external HTML theming engine, a.k.a. theming proxy, which allows you to mix and match HTML and CSS from internal and external sites by using simple XML rules. It separates theme development from site development, so that people with little HTML and CSS knowledge can create themes without needing to know the underlying Python, PHP or whatever. It also enables integration of different services and sites into one unified user experience. For example, XDV is used by plone.org <http://plone.org> to integrate Plone CMS and the Trac issue tracker.

XDV compiles theming rules to XSL templates. XSL has been a standard XML-based template language since 1999 and is widely supported across programming languages and web servers. Example backends to perform XSL transformation include
XDV theming can be used together with Plone, where enhanced support is provided by the collective.xdv package. Technically, collective.xdv adds a Plone settings panel and does the XSL transformation in Zope’s post-publication hook using the lxml library. XDV can also be used standalone, with the XDV package, to theme any web site, be it WordPress, Joomla, Drupal or a custom in-house PHP solution from the year 2000.

XDV is based on the Deliverance specification. The difference between XDV and the Deliverance reference implementation is that XDV internally compiles themes to XSL templates, whereas Deliverance relies on processing HTML in Python. Currently the XDV approach seems to work better, as we had many problems trying to apply Deliverance to a WordPress site (redirects didn’t work, HTTP posts didn’t work, etc.).

Setting up XDV development tools

XDV tools are deployed as Python eggs. You can use the buildout <http://www.buildout.org/> configuration and assembly tool or easy_install to get XDV on your development computer and the server. If you are working with Plone, you can integrate XDV into your site’s existing buildout. If you are not working with Plone, the XDV home page has instructions on how to deploy the XDV command standalone.

XDV Rules

Rules (rules.xml) tell how to fit content from an external source into your theme HTML. They provide a straightforward XML-based syntax to manipulate HTML easily
The rules XML syntax is documented on the XDV homepage. Rules are compiled to an XSL template (theme.xsl) by the xdvcompiler command. The actual theming is done by one of the XSL backends listed above, which takes HTML as input and applies the XSL transformations to it.

Note that currently rules without matching selectors are silently ignored, and there is no bullet-proof way to debug what happens inside the XSL transformation except by looking into the compiled theme.xsl.

Using XDV to theme and integrate a WordPress site

Below are instructions on how to integrate a WordPress site into your CMS. In this example the CMS is Plone, but it could be any other system. We will create an XDV theme which will theme the WordPress site to match our CMS site on the fly.

The WordPress theme is built with XDV and uses a live Plone web page as the theme template. This way the WordPress theme inherits “live data” from the Plone site, like top tabs (portal sections), footer, CSS and other things that can change on the fly; reflecting such changes in two separate theming products would be cumbersome.

Benefits of using WordPress for blogging instead of the main CMS
Benefits of using XDV theming instead of creating a native WordPress theme are

Theme elements

The theme will consist of the following pieces

CMS page template

This explains how to create a Plone page template where WordPress content will be dropped in. This step is not strictly necessary, as we could do this without touching Plone. However, it makes things more straightforward and explicit when we know that the WordPress theme uses a certain template and we explicitly define slots for WordPress content there.

Example:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"
xmlns:tal="http://xml.zope.org/namespaces/tal"
xmlns:metal="http://xml.zope.org/namespaces/metal"
xmlns:i18n="http://xml.zope.org/namespaces/i18n"
lang="en"
metal:use-macro="here/main_template/macros/master"
i18n:domain="plone">
<body>
<div metal:fill-slot="content">
<div id="wordpress-content">
<!-- Your WordPress "left column" will go there -->
</div>
</div>
</body>
</html>

Theming rules

The following XDV rules (rules.xml) define how we will fit the WordPress site into the Plone frame. It will integrate

rules.xml:

<?xml version="1.0" encoding="UTF-8"?>
<rules xmlns="http://namespaces.plone.org/xdv"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:css="http://namespaces.plone.org/xdv+css">
<!-- Remove WordPress CSS by filtering out <style> tags-->
<drop css:content="style" />
<!-- Make sure that WordPress metadata is present in <head> section -->
<append css:content="head link" css:theme="head" />
<!-- note: replace does not seem to handle multiple meta tags very well -->
<drop css:theme="meta" />
<append css:content="head meta" css:theme="head" />
<!-- Use blog title instead of Plone page title -->
<replace css:content="title" css:theme="title" />
<!-- Put WordPress sidebar to Plone's portlets section -->
<append css:content="#r_sidebar" css:theme="#portal-column-one .visualPadding" />
<!-- Place wordpress content into our theme content area -->
<copy css:content="#contentleft" css:theme="#wordpress-content" />
<!-- This mixes in WordPress specific CSS sheet which is applied for pages
served from WordPress only and does not concern Plone CMS.
This stylesheet will theme WordPress specific tags,
like blog posts and comment fields.
We keep this file in Plone, but this could be served from elsewhere. -->
<append css:theme="head">
<style type="text/css">
@import url(http://mfabrik.com/++resource++plonetheme.mfabrik/wordpress.css);
</style>
</append>
<!-- This stylesheet is used by special spam protection plug-in NoSpamNX -->
<append css:theme="head">
<link rel="stylesheet" href="http://blog.mfabrik.com/wp-content/plugins/nospamnx/nospamnx.css" type="text/css" />
</append>
<!-- Remove Google Analytics script used for CMS site -->
<drop css:theme="#page-bottom script" />
<!-- Rebuild our Google Analytics code, using a different tracker id this time
which is a specific to our blog.
-->
<append css:theme="#page-bottom">
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-8819100-2");
pageTracker._trackPageview();
} catch(err) {
}
</script>
</append>
</rules>
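
Because rules whose selectors match nothing are silently ignored, it can help to list every selector in rules.xml and check each one by hand against the theme and content HTML. The following is a minimal sketch using only the Python 3 standard library; the helper name list_selectors and the SAMPLE_RULES snippet are hypothetical, and the script only lists selectors, it does not verify that they match anything.

```python
import xml.etree.ElementTree as ET

# Namespace of the css:content / css:theme attributes used in rules.xml
XDV_CSS_NS = "http://namespaces.plone.org/xdv+css"

# Hypothetical two-rule excerpt in the same syntax as the rules above
SAMPLE_RULES = """<rules xmlns="http://namespaces.plone.org/xdv"
       xmlns:css="http://namespaces.plone.org/xdv+css">
    <drop css:content="style" />
    <copy css:content="#contentleft" css:theme="#wordpress-content" />
</rules>"""


def list_selectors(rules_source):
    """Return (rule, side, selector) tuples for each CSS selector found.

    `rules_source` is the text of a rules file; `side` is "content"
    or "theme" depending on which attribute carried the selector.
    """
    root = ET.fromstring(rules_source)
    found = []
    for element in root.iter():
        # Strip the "{namespace}" prefix ElementTree puts on tag names
        rule = element.tag.split("}")[-1]
        for side in ("content", "theme"):
            selector = element.get("{%s}%s" % (XDV_CSS_NS, side))
            if selector is not None:
                found.append((rule, side, selector))
    return found


if __name__ == "__main__":
    for rule, side, selector in list_selectors(SAMPLE_RULES):
        print("%-8s %-8s %s" % (rule, side, selector))
```

Each listed selector can then be pasted into a browser's developer tools on the theme page or the WordPress page to see whether it matches an element.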

WordPress specific CSS

This CSS has styles which are applied only to WordPress pages. They are mainly corner-case fixes where WordPress and CMS styles must match. The CSS file is loaded when rules.xml injects it into the <head> section.

wordpress.css:

/* Font and block style fixes */
#wordpress-content h1 {
border: 0;
}
#wordpress-content .post-end {
margin-bottom: 60px;
}
#wordpress-content pre {
width: 600px;
overflow: auto;
background: white;
border: 1px solid #888;
}
#wordpress-content ul {
margin-left: 20px;
}
#wordpress-content .post-info-date,
#wordpress-content .post-info-categories,
#wordpress-content .post-info-tags {
font-size: 80%;
color: #888;
}
/* Make sure that posts and comments look sane in our theme */
#wordpress-content .post {
margin-top: 15px;
}
#wordpress-content .commentlist li {
margin: 20px;
background: white;
padding: 10px;
}
#wordpress-content .commentlist li img {
float: left;
margin-right: 20px;
margin-bottom: 20px;
}
#wordpress-content #commentform {
margin: 20px;
}
#wordpress-content {
margin-left: 20px;
margin-right: 20px;
}
/* Make WordPress "sidebar" look like Plone "portlets" */
.template-wordpress_listing #portal-column-one ul {
list-style: none;
margin-bottom: 40px;
}
.template-wordpress_listing #portal-column-one ul#Recent li {
margin-bottom: 8px;
}
.template-wordpress_listing #portal-column-one ul#Categories a {
line-height: 120%;
}
.template-wordpress_listing #portal-column-one h2 {
background: transparent;
border: 0;
font-weight:normal;
line-height:1.6em;
padding:0;
text-transform:none;
font-size: 16px;
color: #9b9b9b;
border-bottom:4px solid #CDCDCD;
}

Helper script

The following Python script (xdv.py) makes it easy for us

xdv.py:

"""
This command line Python script compiles your rules.xml to XDV XSL
Modify it for your own needs.
It assumes your buildout.cfg has xdv section and generated XDV
commands under bin/
To compile, execute in the buildout folder::
python src/plonetheme.mfabrik/xdv.py
To build test HTML::
python src/plonetheme.mfabrik/xdv.py --test
To build test HTML and preview it in browser, execute in buildout folder::
python src/plonetheme.mfabrik/xdv.py --preview
"""
import getopt, sys
import os
import webbrowser
# rules XML for theming
RULES_XML = "src/plonetheme.mfabrik/deliverance/etc/rules.xml"
# Which XSL file to generate for compiled XDV
OUTPUT_FILE = "theme.xsl"
# Which file to generate applied theme test runs
TEST_HTML_FILE = "test.html"
# Our "theme.html" is a remote template served for each request.
# Because we are doing live integration, this is an HTTP resource,
# not a local file.
THEME="http://mfabrik.com/news/wordpress_listing/"

#
# External site you are theming.
# Note: must have ending slash (lxml cannot handle redirects)
#
SITE="http://blog.twinapex.fi/"

try:
    opts, args = getopt.getopt(sys.argv[1:], "pt", ["preview", "test"])
except getopt.GetoptError, err:
    # print help information and exit:
    print str(err) # will print something like "option -a not recognized"
    # Without this exit, opts would be undefined below
    sys.exit(2)

# Convert options to simple list
opts = [ opt for opt, value in opts ]

print "Compiling transformation"
value = os.system("bin/xdvcompiler -o " + OUTPUT_FILE + " " + RULES_XML +" " + THEME)
if value != 0:
    print "Compilation failed"
    sys.exit(1)

if "-p" in opts or "--preview" in opts or "-t" in opts or "--test" in opts:
    print "Generating test HTML page"
    value = os.system("bin/xdvrun -o " + TEST_HTML_FILE + " " + OUTPUT_FILE + " " + SITE)
    if value != 0:
        print "Page transformation failed"
        sys.exit(1)

if "-p" in opts or "--preview" in opts:
    # Preview the result in a browser
    # NOTE: OSX needs Python >= 2.5 to make this work

    # The test run above has succeeded if we get here
    url = "file://" + os.path.abspath(TEST_HTML_FILE)
    print "Opening:" + url

    # We prefer Firefox for preview for its superior
    # Firebug HTML debugger and XPath rule generator
    try:
        browser = webbrowser.get("firefox")
    except webbrowser.Error:
        # No FF on the system, or OSX which can't find its browsers
        browser = webbrowser.get()

    browser.open_new_tab(url)
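
The helper script builds its shell commands by string concatenation, which breaks if a path or URL ever contains a space or shell metacharacter. A possible variation, sketched below, passes argument lists to subprocess instead. The function names compiler_command, run_command and run are hypothetical; the command names and the -o option are the ones already used by xdv.py above.

```python
import subprocess


def compiler_command(output_file, rules_xml, theme_url):
    """Argument list equivalent to the xdvcompiler call in xdv.py."""
    return ["bin/xdvcompiler", "-o", output_file, rules_xml, theme_url]


def run_command(test_html_file, xsl_file, site_url):
    """Argument list equivalent to the xdvrun call in xdv.py."""
    return ["bin/xdvrun", "-o", test_html_file, xsl_file, site_url]


def run(command):
    """Run a command without a shell; return True if it exited with 0."""
    return subprocess.call(command) == 0
```

With these helpers, the compile step would read run(compiler_command(OUTPUT_FILE, RULES_XML, THEME)), and no quoting of the arguments is needed because no shell is involved.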

Compiling the theme

This will generate the XSL templates that do the theming transform. It compiles the rules XML together with some boilerplate XSL.

Running our compile script:

python src/plonetheme.mfabrik/xdv.py

Since Plone usually does not use any relative paths or relative resources in HTML, we do not give the “Absolute prefix” parameter to the compilation stage. In Plone, everything is mapped through a virtual hosting aware resource locator: portal_url and VirtualHostMonster.

For more information see

Testing the theme

The following commands will apply the theme to an example external page and open the result:

bin/xdvrun -o theme.html theme.xsl http://blog.twinapex.fi
firefox theme.html

… or we can use the shortcut provided by our script …

python src/plonetheme.mfabrik/xdv.py --preview

Applying the theme in Apache production environment

These steps tell how to apply the integration theme for WordPress when WordPress is running under an Apache virtualhost.

Installing dependencies

We use Apache and mod_transform. Instructions on how to set up the modules for Apache are available on the XDV homepage. Some hand-built modules must be used, but instructions to set them up for Ubuntu / Debian are available.

Apache 2 supports filter chains, which allow you to perform magic on the HTTP response before sending it out. This corresponds to Python’s WSGI middleware.

We’ll use special builds of mod_transform and mod_depends which are known to work. These modules were forked from their original versions to make them XDV compatible, as the originals have not been updated since 2004 (here you can nicely see how open source guarantees the “won’t run out of support” freedom).

Example:

sudo -i
apt-get install libxslt1-dev libapache2-mod-apreq2 libapreq2-dev apache2-threaded-dev
wget http://html-xslt.googlecode.com/files/mod-transform-html-xslt.tgz
wget http://html-xslt.googlecode.com/files/mod-depends-html-xslt.tgz
tar -xzf mod-transform-html-xslt.tgz
tar -xzf mod-depends-html-xslt.tgz
cd mod-depends-html-xslt ; ./configure ; make ; make install ; cd ..
cd mod-transform-html-xslt ; ./configure ; make ; make install ; cd ..

Enable built-in Apache modules:

a2enmod filter
a2enmod ext_filter

For the depends and transform modules you need to add them manually to the end of the Apache configuration, as they do not provide a2enmod stubs for Debian. Edit /etc/apache2/apache2.conf:

LoadModule depends_module /usr/lib/apache2/modules/mod_depends.so
LoadModule transform_module /usr/lib/apache2/modules/mod_transform.so

You need to hard reset Apache to make the new modules effective:

/etc/init.d/apache2 force-reload

Virtual host configuration

Below is our virtualhost configuration which runs WordPress and PHP. The transformation filter chain has been added in.

/etc/apache2/sites-enabled/blog.mfabrik.com:

<VirtualHost *>
ServerName blog.mfabrik.com
ServerAdmin info@mfabrik.com
LogFormat combined
TransferLog /var/log/apache2/blog.mfabrik.com.log
# Basic WordPress setup
Options +Indexes FollowSymLinks +ExecCGI
DocumentRoot /srv/www/wordpress
<Directory /srv/www/wordpress>
Options FollowSymlinks
AllowOverride All
</Directory>
AddType application/x-httpd-php .php .php3 .php4 .php5
AddType application/x-httpd-php-source .phps
# Theming set-up
# This chain is used for public web pages
FilterDeclare THEME
FilterProvider THEME XSLT resp=Content-Type $text/html
TransformOptions +ApacheFS +HTML
# This is the location of compiled XSL theme transform
TransformSet /theme.xsl
# This makes Apache not reload the transformation every time
# it is performed. Instead, a compiled version is held at the
# virtual URL declared above.
TransformCache /theme.xsl /srv/plone/twinapex.fi/theme.xsl
# We want to apply theme only for
# 1. public pages (otherwise WordPress administrative interface stops working)
<Location "/">
FilterChain THEME
</Location>
# 2. Admin interface and feeds should not receive any kind of theming
<LocationMatch "(wp-login|wp-admin|wp-includes)">
# The following resets the filter chain
# http://httpd.apache.org/docs/2.2/mod/mod_filter.html#filterchain
FilterChain !
</LocationMatch>
</VirtualHost>
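
To check which URLs the LocationMatch exclusion covers before touching the live server, the same pattern can be tried in Python. This is only a sketch: the function name is_themed is hypothetical, and it assumes the pattern may match anywhere in the URL path (the pattern above carries no anchors). It models only the path check, not the Content-Type condition of the filter.

```python
import re

# Same pattern as in the <LocationMatch> block of the virtual host above
UNTHEMED = re.compile(r"(wp-login|wp-admin|wp-includes)")


def is_themed(path):
    """Return True if the THEME filter chain would stay enabled for `path`."""
    # The pattern is unanchored, so use search() rather than match()
    return UNTHEMED.search(path) is None


if __name__ == "__main__":
    for path in ("/", "/2010/03/some-post/", "/wp-login.php",
                 "/wp-admin/index.php", "/feed/"):
        print("%-25s themed=%s" % (path, is_themed(path)))
```

Note that /feed/ is not excluded by the pattern itself. Feeds are presumably left alone because the filter provider only acts on text/html responses, but that is worth verifying on your own site.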

Running it

After Apache has all modules enabled and your virtualhost configuration is OK, you should see WordPress through your new theme by visiting the site served through Apache:

Automatically reflecting CMS changes back to XDV theme

The theme should be recompiled every time
This is because the compilation hard-links resources and template snippets into the resulting theme.xsl file. If the hard-linked resources change on the Plone site, the transformation XSL file does not automatically reflect the changes.

It could be possible to use Plone events to rerun the theme compilation automatically when the concerned resources change. However, that would be quite complex. For now, we are satisfied with a scheduled task which recompiles the theme now and then. Alternatively, mod_transform could be run in non-cached mode, with some performance implications.

Here is a shell script, update-wordpress-theme.sh, which performs the recompilation and makes Apache’s transformation cache aware of the changes:

#!/bin/sh
#
# Periodically update WordPress theme to reflect changes on CMS site
#

# Recompile theme
sudo -H -u twinapex /bin/sh -c "cd /srv/plone/twinapex.fi ; python src/plonetheme.mfabrik/xdv.py"

# Make Apache aware of theme changes
sudo apache2ctl graceful

Then we call it periodically from a cron job, every 15 minutes, in /etc/cron.d/update-wordpress:

# Make WordPress XDV theme to reflect changes on CMS
# (entries in /etc/cron.d need a user field; root is assumed here)
0,15,30,45 * * * * root /srv/plone/twinapex.fi/update-wordpress-theme.sh

Updating WordPress settings

No changes to WordPress are needed if the domain name is not changed in the theme transformation process.

Site URL

Unlike Plone, WordPress does not have decent virtual hosting machinery. It knows only one URL, which it uses to refer to the site in external contexts (e.g. RSS feeds). This setting can be overridden in
Here is an example of how we override this in our wp-config.php:

// http://codex.wordpress.org/Editing_wp-config.php#WordPress_address_.28URL.29
define('WP_HOME','http://blog.mfabrik.com');
define('WP_SITEURL','http://blog.mfabrik.com');

HTTP 404 Not Found special case

HTTP 404 Not Found responses are not themed by the Apache filter chain. This is not possible due to the order of the pipeline in Apache. As a workaround, you can set up a custom HTTP 404 page in WordPress which does not expose the old theme.
For more information see

Roll-out checklist

Below is a checklist you need to go through to confirm that the theme integration works on your production site

SEO tips: query strings, multiple languages, forms and other content management system issues
Posted on August 7, 2009 by Mikko Ohtamaa
Filed Under plone, technology

This post is a collection of search engine optimization tips for content management systems, especially for Plone.

Do not index query strings

It is often desirable to make sure that query string pages (http://yoursite/page?query_string_action=something) do not end up in the search indexes. Otherwise search bots might index pages like the site’s own search engine results (yoursite/search?SearchableText=…), lowering the visibility of the actual content pages. GoogleBot supports pattern matching (the * and $ wildcards) in robots.txt and can be configured to ignore any URL with ? in it. See the example below.

Query string indexing causes the crawler to crawl things like
Also, “almost” human-readable query strings look ugly in the address bar…

Top level domains and languages

Using a top level domain name (.fi for Finland, .uk for the United Kingdom, and so on) to distinguish between different languages and areas is the optimal solution from the SEO point of view. Search engines use TLD information to reorder the search results based on where the search query is performed (there is a difference between google.com and google.fi results).

Plone doesn’t use any query strings for content pages. Making robots ignore query strings is especially important if you are hosting a multilingual site and you use the top level domain name (TLD) to separate languages: if you don’t configure robots.txt to ignore ?set_language links, only one of your top level domains (.com, .fi, .xxx) will get proper visibility in the search results.

For example, we had a situation where our domain www.twinapex.fi did not get proper visibility because Google considered www.twinapex.com?set_language=fi the primary content source (accessing Finnish content through the English site and language switching links).

Shared forms

Plone has some forms (send to, login) which can appear on any content page. These must be disallowed, or otherwise you might have a search result where the link goes to the form page instead of the actual content page.

Hidden content and content excluded from the navigation

Any content excluded from the sitemap navigation should be put under Disallow in robots.txt. E.g. if you check “exclude from navigation” for a Plone folder, remember to update robots.txt also.

In our case, our internal image bank must not end up being indexed, though the images themselves are visible on the site. Otherwise you get funny search results: if you search by a person’s name, the photo will be the first hit instead of the biography.
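
The Googlebot rules in the robots.txt example in this post, such as /*?*, can be tried against sample URLs before deploying them. Below is a minimal Python 3 sketch of that kind of wildcard matching, assuming the commonly documented semantics where * matches any run of characters and a trailing $ anchors the end of the URL. The function names pattern_to_regex and is_blocked are hypothetical, and real crawlers may differ in details.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Googlebot-style robots.txt path pattern to a regex.

    "*" matches any run of characters and a trailing "$" anchors the
    end of the URL; everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal parts and join them with ".*" for each "*"
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url_path, patterns):
    """Return True if any Disallow pattern matches `url_path`."""
    return any(pattern_to_regex(p).search(url_path) for p in patterns)


if __name__ == "__main__":
    rules = ["/*?*", "/*folder_factories$"]
    for path in ("/news", "/news?set_language=fi", "/news/folder_factories"):
        print("%-30s blocked=%s" % (path, is_blocked(path, rules)))
```

Running a handful of real site URLs through is_blocked is a cheap way to spot a rule that blocks more, or less, than intended.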

Sitemap protocol

Crawlers use the Sitemap protocol to help determine the content pages on your site (note: the sitemap seems to be used for hinting only and is not authoritative). Since version 3.1 Plone can automatically generate sitemap.xml.gz. You still need to register sitemap.xml.gz in Google Webmaster Tools manually.

There exists a sitemap protocol extension for mobile sites.

Webmaster tools

Google Webmaster Tools enables you to monitor your site’s visibility in Google and do some search engine specific tasks like submitting sitemaps. I do not know what kind of similar functionality other search providers have. Please share your knowledge in the blog comments regarding this.

HTML <head> metadata

Search engines mostly ignore <meta> tags besides title, so there is no point in trying to fine-tune them.

Example robots.txt

Here is our optimized robots.txt for www.twinapex.com:

# Normal robots.txt body is purely substring match only
# We exclude lots of general purpose forms which are available in various mount points of the site
# and internal image bank which is hidden in the navigation tree in any case
User-agent: *
Disallow: set_language
Disallow: login_form
Disallow: sendto_form
Disallow: /images

# Googlebot allows wildcard patterns in its syntax
# Block all URLs including query strings (? pattern) - contentish objects expose query string only for actions or status reports which
# might confuse search results.
# This will also block ?set_language
User-Agent: Googlebot
Disallow: /*?*
Disallow: /*folder_factories$

# Allow Adsense bot on entire site
User-agent: Mediapartners-Google*
Disallow:
Allow: /*

Useful resources

XHTML mobile profile transformer and cleaner for Python
Posted on July 23, 2009 by Mikko Ohtamaa
Filed Under mobile, plone, python, technology

Mobile phones, and especially mobile site validators, are very picky about the validity of XHTML. It cannot be just any XHTML; it must be the special mobile profile XHTML. Also, search engines like Google will punish you in the mobile search results if your site fails to conform to the mobile profile. This is especially troublesome if you display external content (RSS feeds, Atom feeds) on your mobile site. Incoming HTML cannot be guaranteed to follow any specification.

To solve this problem, we have created the gomobile.xhtmlmp Python library, which helps you to transform any HTML content to valid XHTML MP. The library is piloted on the plonecommunity.mobi site, which uses aggregated content from varying sources. The library is based on lxml.html.Cleaner.

The library is part of the GoMobile project, which aims to create world class Python mobile web development tools.

Highlights
As an example we integrated gomobile.xhtmlmp into the Feedfeeder Plone add-on product. Enjoy.

Plone goes mobile
Posted on July 9, 2009 by Mikko Ohtamaa
Filed Under Business, plone, technology

FYI Plone GoMobile project. plonecommunity.mobi demo site. More to come.

Zope Zeo vs. standalone setups
Posted on July 7, 2008 by Tuukka Mustonen
Filed Under Plone (old), Red innovation, apache, database, linux, performance, ubuntu, zope

We do some Plone development here at Redi. As is known, Plone is a powerful but unfortunately quite heavy CMS which is best suited for intranets. Thus, we are always looking for speed increases. Enter the Zeo cluster: a feature that nowadays comes bundled with Zope and allows one database (practically Data.fs) to be used by multiple Zope instances, or more accurately Zeo clients.

In a standalone installation only one CPU / CPU core can be used for processing requests (as the Zope / Python implementation is single-threaded AFAIK). So if there are any concurrent requests, the database (ZODB, the Zope Object Database) usually has to wait for the request processing before it is asked for the data, and only part of the processing power is used as requests are queued.

With the Zeo server-client architecture, however, each Zeo client can do the processing on its own CPU/core (thus efficiently using the whole CPU processing power available) and also minimize hard disk idle time by asking for data in an ~asynchronous manner (in separate queues). Actually, ZODB even serves the same object simultaneously to different client processes for performance reasons. This might raise database ConflictErrors, which are nothing to be afraid of, however, as noted some paragraphs below.

Similarly, you could also deploy Zeo clients on different computers in the local network (or wherever you want), but that is outside the scope of this article.
Having clients running on different machines is a similar case with the same performance basis, but there are connection lags, bandwidth limits and such that decrease performance.

Theory vs. practice

Deploying a Zeo cluster instead of a standalone Zope instance should theoretically increase the performance by a factor of the extra available CPUs / CPU cores. There might be some overhead from this setup though, so we tested it using ApacheBench, the benchmarking tool that comes bundled with Apache nowadays. But first something about…

Setting up Zeo & converting from standalone mode

In the simplest scenario, setting up Zeo is rather easy: the unified installer supports a Zeo server setup out of the box (= there is a recipe for it). Just run the unified installer like this:

$ ./install.sh zeo

Luckily, the unified installer uses buildout from Plone 3.1 onwards. Thus, converting your current buildout instances to a Zeo cluster is nothing but a change of buildout configuration. Where you would normally have an ‘instance’ section in your buildout.cfg, you will now need the following:

[zeoserver]
recipe = plone.recipe.zope2zeoserver
zope2-location = ${zope2:location}
zeo-address = 127.0.0.1:12000
#effective-user = __EFFECTIVE_USER__
[client1]
recipe = plone.recipe.zope2instance
zope2-location = ${zope2:location}
zeo-client = true
zeo-address = ${zeoserver:zeo-address}
# The line below sets only the initial password. It will not change an
# existing password.
user = admin:mysecretpassword
http-address = 12001
#effective-user = __EFFECTIVE_USER__
#debug-mode = on
#verbose-security = on
# If you want Zope to know about any additional eggs, list them here.
# This should include any development eggs you listed in develop-eggs above,
# e.g. eggs = ${buildout:eggs} ${plone:eggs} my.package
eggs =
    ${buildout:eggs}
    ${plone:eggs}
# If you want to register ZCML slugs for any packages, list them here.
# e.g. zcml = my.package my.other.package
zcml =
products =
    ${buildout:directory}/products
    ${productdistros:location}
    ${plone:products}
To add more clients (which is the whole point here), append as many extra client sections as you need, like this:
[client2]
recipe = plone.recipe.zope2instance
zope2-location = ${zope2:location}
zeo-client = true
zeo-address = ${zeoserver:zeo-address}
user = ${client1:user}
http-address = 12002
#effective-user = __EFFECTIVE_USER__
#debug-mode = on
#verbose-security = on
eggs = ${client1:eggs}
zcml = ${client1:zcml}
products = ${client1:products}
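Since every extra client section differs only in its name and HTTP port, the sections can also be generated rather than typed. The following is a minimal sketch, not part of the unified installer: the helper name `extra_client_sections` and the `base_port` parameter are made up for illustration, and the port scheme (12000 + client number) is simply the one used in the examples here.

```python
# Template for one extra Zeo client section. Doubled braces keep the
# buildout ${...} references literal when str.format() is applied.
CLIENT_TEMPLATE = """\
[client{n}]
recipe = plone.recipe.zope2instance
zope2-location = ${{zope2:location}}
zeo-client = true
zeo-address = ${{zeoserver:zeo-address}}
user = ${{client1:user}}
http-address = {port}
eggs = ${{client1:eggs}}
zcml = ${{client1:zcml}}
products = ${{client1:products}}
"""


def extra_client_sections(total_clients, base_port=12000):
    """Return buildout text for client2..clientN (client1 is written by hand)."""
    return "\n".join(
        CLIENT_TEMPLATE.format(n=n, port=base_port + n)
        for n in range(2, total_clients + 1)
    )


if __name__ == "__main__":
    # Print sections for client2, client3 and client4 to paste into buildout.cfg.
    print(extra_client_sections(4))
```

The printed text is meant to be pasted into buildout.cfg by hand; remember to list the new parts in the buildout as well.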
Referencing the client1 values minimizes the need for retyping user names, passwords etc. These examples were taken from the Plone unified installer buildout.cfg, with ports changed.

Starting, stopping & restarting
Now, to start your Zeo-powered Plone clients you could type:
bin/zeoserver start
bin/client1 start
bin/client2 start
...same for all the clients...
However, the unified installer has a recipe which automatically generates nice and simple shell scripts to control your cluster. At the end of your buildout.cfg, add:
[unifiedinstaller]
recipe = plone.recipe.unifiedinstaller
user = ${client1:user}
primary-port = ${client1:http-address}
That should generate the scripts. In fact, it probably also does something else that I’m not aware of. However, I haven’t bumped into any problems yet. Start the cluster with:
bin/startcluster.sh
And that does it (it starts the server and the clients). Shut it down via:
bin/shutdowncluster.sh
And restart:
bin/restartcluster.sh

ConflictErrors: not that erroneous
As noted before, in Zeo mode the ZODB might serve the same objects to two or more clients at the same time. If one client modifies the object before the others (i.e. edits values and saves the changes), the other requests will probably fail. This raises a ConflictError, which looks like this:
ConflictError: database conflict error (oid 0x0f39, class HelpSys.HelpSys.ProductHelp)
In this case Zope tries to reprocess the failed requests. This is a common database approach and thus a feature, not a bug (although Zope might want to say so in the error message!). For a more accurate explanation see Plone discussion.

Piecing it together with a web server
The Zeo components (server and clients) talk to each other over TCP. In the default setup, the Zeo server listens on port 8100 and the Zeo clients on 8080, 8081, etc. Thus, to access the separate clients as ‘one site’ we need to spread the requests across multiple clients. This can be achieved with a load balancer. Apache has at least one: mod_proxy_balancer, which should do exactly what we need. Apache isn’t the best choice for achieving high requests-per-second values, but it will do for our tests (compare with the more lightweight but also more limited lighttpd). Just remember that there are other alternatives available, like using squid as a load balancer. Our configuration is as follows (inside the VirtualHost directive):
<Proxy balancer://lb>
BalancerMember http://127.0.0.1:12001/
BalancerMember http://127.0.0.1:12002/
BalancerMember http://127.0.0.1:12003/
BalancerMember http://127.0.0.1:12004/
</Proxy>
<Location /balancer-manager>
SetHandler balancer-manager
Order Deny,Allow
Allow from all
</Location>
ProxyPass /balancer-manager !
ProxyPass / balancer://lb/http://localhost/VirtualHostBase/http/www.mydomain.com:80/plonesite/VirtualHostRoot/
ProxyPassReverse / balancer://lb/http://localhost/VirtualHostBase/http/www.mydomain.com:80/plonesite/VirtualHostRoot/
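To get a feel for what balancing by request count means for four equally weighted members, here is a minimal sketch. It is a simplified round-robin model, not Apache’s actual implementation, and the function name `distribute` is made up for illustration. The backend URLs are the BalancerMember entries from the configuration.

```python
from collections import Counter
from itertools import cycle

# The four BalancerMember URLs from the Apache configuration.
BACKENDS = [
    "http://127.0.0.1:12001/",
    "http://127.0.0.1:12002/",
    "http://127.0.0.1:12003/",
    "http://127.0.0.1:12004/",
]


def distribute(requests, backends):
    """Hand each request to the next backend in turn and count the results."""
    counts = Counter({backend: 0 for backend in backends})
    # zip() stops when range() is exhausted, so cycle() never runs forever.
    for _, backend in zip(range(requests), cycle(backends)):
        counts[backend] += 1
    return counts


if __name__ == "__main__":
    for backend, handled in distribute(1000, BACKENDS).items():
        print(backend, handled)
```

With 1000 requests each of the four members handles 250, which is the kind of even split the balancer-manager page should show when all members have equal weight.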
The Apache configuration above also allows us to use the balancer-manager (accessible at /balancer-manager) that comes with mod_proxy_balancer. It’s useful for checking that the configuration is working and that the balancer is dividing the requests equally. In my setup the balancer uses the default Request Counting algorithm, which divides the number of requests equally between the instances, but you might also want to try Weighted Traffic Counting, which should suit real-world use better. In our test only the front page is accessed, however, so each request transfers the same amount of data and weighted traffic counting brings no benefit.

The test
The server machine
The setup
The tests were run locally in the development environment to minimize network lag (it was 0-1 ms).

The test commands
ApacheBench commands:
$ ab -n N -c C myurl
where N was either 1000 or 9000 (requests) and C was 1, 10, 100 or 1000 (concurrent requests).

The results
You can download the more in-depth test sheet Plone Standalone vs. Zeo installation (PDF). To put it simply: theory and practice agree well. The Zeo setup is a lot more powerful with concurrent requests. On non-concurrent requests the results are about the same. Having as many Zeo clients as CPUs / CPU cores can boost performance by a factor of up to the number of CPUs/cores. For example, on our quad-core server the Zeo setup gave nearly 4 times the requests per second of the standalone installation (~370% to be accurate). Increasing the number of Zeo clients to 6 didn’t help at all, as there’s no processing power left over from 4 heavily stressed client processes. Also worth noting is that the waiting times for clients more than doubled (the median jumped from 126 to 305 ms) when raising concurrency from 1 to 10. This isn’t bad though: those are still low figures compared to the standalone median of 1215 ms! Only when raising concurrency to 100 did we begin to see waiting times of some 3.6 seconds (6 seconds for standalone). As expected, increasing concurrency didn’t bring down the requests/second rates much (less than 5%). Overall, the results were as expected, but now we have evidence: under concurrent request load a Zeo cluster is a good option for multiplying the performance of your site. For very low traffic sites which rarely get more than 1 request at a time this doesn’t matter. One bad word about the resource requirements though: RAM usage for the 6-client Zeo setup (standard Plone 3.1.2 + 12 additional Products) increased by a whopping 621 MB (1132 MB -> 1753 MB). That means about 100 MB per Zeo client, as the Zeo server’s memory footprint was only about 12-15 MB.
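The per-client memory figure can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers quoted above. It assumes the whole 621 MB increase is shared by the Zeo server and the 6 clients, and the function name `per_client_memory_mb` is made up for illustration.

```python
def per_client_memory_mb(before_mb, after_mb, clients, server_mb):
    """Estimate RAM per Zeo client from a before/after measurement.

    Assumes the whole increase is used by the Zeo server plus the clients.
    """
    increase = after_mb - before_mb
    return (increase - server_mb) / clients


# Zeo server footprint was measured at roughly 12-15 MB, so compute both ends.
low = per_client_memory_mb(1132, 1753, clients=6, server_mb=15)
high = per_client_memory_mb(1132, 1753, clients=6, server_mb=12)

if __name__ == "__main__":
    print(f"{low:.1f} - {high:.1f} MB per client")  # prints "101.0 - 101.5 MB per client"
```

Each additional client therefore costs roughly 100 MB of RAM on this setup, whether or not it adds any throughput.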
Thus, only use as many Zeo clients as absolutely necessary or you might find your beloved server machine under very serious Zope flu! |
