|
PyDev, Python and system default Unicode encoding problem
Posted on January 24, 2010 by Mikko Ohtamaa
Filed Under aptana studio, eclipse, plone, pydev, technology

Python 2 has a thing called “default encoding” which automagically encodes Unicode strings when they are used as byte strings. This is evil and has been discussed various times before.

What could be even more evil? Something in your development environment changes this setting for you, without telling you. This way you never encounter Unicode problems on your development computer, and when you roll out your seemingly working code to production, the world goes haywire. Evil. Evil. Evil. Thousands of curses and hours of overtime to fix the problems.

I encountered this problem. And this is the code I used to track the problem down in site.py:

# Trap the bastard messing with the default encoding
# using a monkey patch
old_set_default_encoding = sys.setdefaultencoding
def aargh(x):
    # Stop in the debugger; the "where" command then shows who made the call
    import pdb ; pdb.set_trace()
sys.setdefaultencoding = aargh
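A gentler variant of the same trap is sketched below: instead of stopping in the debugger, the wrapper records the call stack and then calls the original function. The names install_trap, fake_sys and sneaky_plugin are made up for this illustration. Python 3 has no sys.setdefaultencoding, so a stand-in object plays the part of the sys module and the sketch runs anywhere.

```python
import traceback
import types


def install_trap(module, name, calls):
    """Replace module.<name> with a wrapper that records who called it."""
    original = getattr(module, name)

    def trapped(*args, **kwargs):
        # Capture the stack, dropping the wrapper's own frame at the end
        stack = traceback.extract_stack()[:-1]
        calls.append((args, [(f.filename, f.lineno, f.name) for f in stack]))
        return original(*args, **kwargs)

    setattr(module, name, trapped)
    return original  # keep a handle so the patch can be undone later


# Hypothetical stand-in for the sys module (assumption, see the text above)
fake_sys = types.SimpleNamespace(setdefaultencoding=lambda encoding: None)
calls = []
install_trap(fake_sys, "setdefaultencoding", calls)


def sneaky_plugin():
    # Plays the role of the code that silently changes the encoding
    fake_sys.setdefaultencoding("utf-8")


sneaky_plugin()
args, stack = calls[0]
print(args, "called from", stack[-1][2])
```

The last entry of the recorded stack is the immediate caller, which is the piece of information the pdb session is used to find.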
And the result was surprising:
Looks like the culprit was PyDev (Eclipse Python plug-in). The interfering source code is here. Looks like the reason was to co-operate with the Eclipse console. However, it has been done incorrectly: instead of setting only the console encoding, the encoding is set for the whole Python run-time environment, messing up the target run-time where the development is being done.

There is a possible fix for this problem. In the Eclipse Run… dialog settings you can choose Console Encoding on the Common tab. One of the possible values is US-ASCII. I am not sure what Python 2 makes of the “US-ASCII” encoding name, since the default is “ascii”.

Installing Python Imaging Library (PIL) under virtualenv or buildout
Posted on November 19, 2009 by Mikko Ohtamaa
Filed Under plone, python, technology

I have struggled greatly to get PIL support working in isolated Python environments like virtualenv --no-site-packages. For example, when installing Satchmo shop under virtualenv:
Though it clearly is there, installed by the easy_install PIL command:

ls ../lib/python2.5/site-packages/PIL-1.1.7-py2.5-linux-x86_64.egg
ArgImagePlugin.py ExifTags.py GimpGradientFile.pyc...

Does anyone know if this problem is with PIL itself, the eggified PIL or something else?

In any case, there is an easy workaround: use the system-wide PIL (sudo apt-get install python-imaging) and symlink PIL from your system-wide installation into the isolated Python environment:

(satchmo-py25)mulli% pwd
/srv/plone/mmaspecial/satchmo-py25/lib/python2.5/site-packages
(satchmo-py25)mulli% ln -s /usr/lib/python2.4/PIL .

That works for now, but I’d like to learn how to make virtualenv and buildout install the PIL egg in a bullet-proof way.
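One way to narrow such a problem down is to ask the isolated interpreter itself what it can see. The helper below is only a sketch: describe_import is a made-up name, and it assumes a modern Python 3 with importlib (on the Python 2.5 used above, imp.find_module would be the closest counterpart). Run it with the virtualenv's own python binary.

```python
import importlib.util
import sys


def describe_import(name):
    """Report whether a top-level module is importable and from where."""
    spec = importlib.util.find_spec(name)
    return {
        "name": name,
        "found": spec is not None,
        # origin is the file the module would be loaded from, if any
        "origin": spec.origin if spec is not None else None,
        # sys.prefix points inside the virtualenv when one is active
        "prefix": sys.prefix,
    }


for module_name in ("PIL", "json"):
    print(describe_import(module_name))
```

If "found" is False for PIL although the egg directory exists, the egg is most likely missing from sys.path rather than broken.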
Cannot sort custom content item in Plone folder listing
Posted on October 5, 2009 by Mikko Ohtamaa
Filed Under plone, technology

Bug: Plone folder manual sorting does not move items, no matter which tricks you try. The first suspect would be a JavaScript bug, but it isn’t. It is bug 8161: your custom content meta_type must not contain spaces. You can fix this on-line by editing the meta type in portal_types in ZMI and removing all spaces from the meta type name.

Subversion global-ignores and .egg-info in Python/Plone development
Posted on October 3, 2009 by Mikko Ohtamaa
Filed Under plone, python, technology

Subversion does a good job of ignoring most build/temporary/unwanted files by default. However, there is one exception still present at least in Subversion 1.6: Python egg folders. Folders whose names end with .egg-info should not be committed or considered in version control actions. The your.package.name.egg-info folder is generated inside your Python egg source folder when you run setup.py / setuptools.

If you are working with Python source code eggs, add the following line to your ~/.subversion/config:

global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store *.egg-info *.pyc *.pyo .project .pydevproject

Otherwise development tools like Mr. Developer might get confused.

Plone Developer Manual, take #0.1
Posted on September 26, 2009 by Mikko Ohtamaa
Filed Under plone, technology

The first public version of the Plone developer manual is available here. It is still very much a draft, but I assure you will find it useful. You will find it even more useful after you put in the answers to your own problems.

In my previous Plone developer documentation rant my flow of thought was a little abstract and I couldn’t clearly explain how I want the community to maintain this crucial piece of documentation. This time I made a comic.
Packing and copying Data.fs from production server for local development
Posted on September 1, 2009 by Mikko Ohtamaa
Filed Under plone, technology

These instructions help you to copy and transfer the production server ZODB database (Data.fs) to your local computer for development and testing. This allows you to test against a copy of the real data and the production server Plone instance setup. See the original tip by cguardia.

Data.fs is the ZODB file storage for the transactional database. Journal history takes quite a lot of disk space there. Packing, i.e. removing the journal history, usually reduces the file size considerably, making the file lighter for wire transfer. Depending on the database age, the packed copy can be less than 10% of the original size.

These instructions apply to Ubuntu/Debian based Linux systems. Adapt them to your own system using the operating system best practices.

We need the ZODB Python package to work with the database. To use it, we’ll create a virtualenv Python installation in /tmp. In a virtualenv installation, installed Python packages do not pollute or break the system-wide setup. Note that you might need to use easy_install-2.4 depending on the OS. The latest stable ZODB can be picked from the PyPI listing. The Plone 3.x default is ZODB 3.7.x, which is not available as a Python egg, but you can use ZODB 3.8.x.

sudo easy_install virtualenv

Data.fs cannot be modified in-place. You must create a copy of it to work with. The copy can be created from a running system without fear of corrupting the database, since ZODB is an append-only database.

cp /yoursite/var/filestorage/Data.fs /tmp/Data.fs.copy

Then create the following script snippet /tmp/pack.py using your favorite terminal editor.

import time
import ZODB.FileStorage
import ZODB.serialize
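# Open the copy, never the live Data.fs
storage = ZODB.FileStorage.FileStorage('/tmp/Data.fs.copy')
# Pack up to the current time, i.e. drop all journal history
storage.pack(time.time(), ZODB.serialize.referencesf)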
And run it using the virtualenv’ed Python setup with ZODB installed:

/tmp/packer/bin/python /tmp/pack.py

Lots of patience here… packing may take a while, but it’s still definitely faster than transferring the unpacked file over your Internet connection. Verify that the file was successfully packed:

ls -lh Data.fs.copy
-rw-r--r-- 1 user user 30M 2009-09-01 13:24 Data.fs.copy

Woohoo, 1 GB was shrunk to 30 MB. Then copy the file to your local computer using scp and place it in your development buildout:

scp user@server:/tmp/Data.fs.copy ~/mybuildout/var/filestorage/Data.fs

You just saved about 30-90 minutes of waiting for the file transfer.

SEO tips: query strings, multiple languages, forms and other content management system issues
Posted on August 7, 2009 by Mikko Ohtamaa
Filed Under plone, technology

This post is a collection of search engine optimization tips for content management systems, especially for Plone.

Do not index query strings
It is often desirable to make sure that query string pages (http://yoursite/page?query_string_action=something) do not end up in the search indexes. Otherwise search bots might index pages like the site’s own search engine results (yoursite/search?SearchableText=…), lowering the visibility of the actual content pages. Googlebot supports simple wildcard patterns (* and $, not full regular expressions) in robots.txt and can be configured to ignore any URL with ? in it. See the example below. Query string indexing causes the crawler to crawl things like
Also, “almost” human readable query strings look ugly in the address bar…

Top level domains and languages
Using the top level domain name (.fi for Finland, .uk for the United Kingdom, and so on) to distinguish between different languages and areas is the optimal solution from the SEO point of view. Search engines use TLD information to reorder the search results based on where the search query is performed (there is a difference between google.com and google.fi results).

Plone doesn’t use any query strings for content pages. Making robots ignore query strings is especially important if you are hosting a multilingual site and you use the top level domain name (TLD) to separate languages: if you don’t configure robots.txt to ignore ?set_language links, only one of your top level domains (.com, .fi, .xxx) will get proper visibility in the search results. For example, we had a situation where our domain www.twinapex.fi did not get proper visibility because Google considered www.twinapex.com?set_language=fi to be the primary content source (accessing Finnish content through the English site and its language switching links).

Shared forms
Plone has some forms (send to, login) which can appear on any content page. These must be disallowed, or otherwise you might have a search result where the link goes to the form page instead of the actual content page.

Hidden content and content excluded from the navigation
Any content excluded from the sitemap navigation should be put under disallowed in robots.txt. E.g. if you check “exclude from navigation” for a Plone folder, remember to update robots.txt also. In our case, our internal image bank must not end up being indexed, though the images themselves are visible on the site. Otherwise you get funny search results: if you search by a person’s name, the photo will be the first hit instead of the biography.
Sitemap protocol
Crawlers use the Sitemap protocol to help determine the content pages on your site (note: the sitemap seems to be used for hinting only and it is not authoritative). Since version 3.1 Plone can automatically generate sitemap.xml.gz. You still need to register sitemap.xml.gz in Google Webmaster tools manually. There exists a sitemap protocol extension for mobile sites.

Webmaster tools
Google Webmaster tools enable you to monitor your site visibility in Google and do some search engine specific tasks like submitting sitemaps. I do not know what kind of similar functionality other search providers have. Please share your knowledge in the blog comments regarding this.

HTML <head> metadata
Search engines mostly ignore <meta> tags besides title, so there is no point in trying to fine-tune them.

Example robots.txt
Here is our optimized robots.txt for www.twinapex.com:

# Normal robots.txt body is purely substring match only
# We exclude lots of general purpose forms which are available in various mount points of the site
# and internal image bank which is hidden in the navigation tree in any case
User-agent: *
Disallow: set_language
Disallow: login_form
Disallow: sendto_form
Disallow: /images

# Googlebot allows wildcards (* and $) in its syntax
# Block all URLs including query strings (? pattern) - contentish objects expose query string only for actions or status reports which
# might confuse search results.
# This will also block ?set_language
User-Agent: Googlebot
Disallow: /*?*
Disallow: /*folder_factories$

# Allow Adsense bot on entire site
User-agent: Mediapartners-Google*
Disallow:
Allow: /*
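To get a feel for what the two Googlebot rules in the robots.txt above would catch, the wildcard matching can be approximated in a few lines of Python. This is a simplified sketch, not Google's implementation: rule_to_regex and is_blocked are made-up names, * is taken to mean "any sequence of characters" and a trailing $ to mean "end of URL", and Allow rules and rule precedence are ignored.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule with * and $ wildcards into a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal parts, then let each * match any run of characters
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(path, disallow_rules):
    """True if any non-empty Disallow rule matches from the start of the path."""
    return any(
        rule_to_regex(rule).match(path) is not None
        for rule in disallow_rules
        if rule
    )


googlebot_rules = ["/*?*", "/*folder_factories$"]
sample_paths = [
    "/news",
    "/news?set_language=fi",
    "/folder/folder_factories",
    "/search?SearchableText=plone",
]
for path in sample_paths:
    print(path, is_blocked(path, googlebot_rules))
```

Only the plain content URL /news passes. Everything with a query string, and the folder_factories action page, is reported as blocked.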
Putting views, like sitemap, into Plone content tree using Easy Template add-on
Posted on July 30, 2009 by Mikko Ohtamaa
Filed Under plone, python, technology

Plone has two kinds of pages
Sometimes it is desirable, for the sake of uniformity, to put view based pages (accessibility, sitemap) into the content tree. For example, one could want to have the sitemap link appear only in the navigation tree under the site section “About this site”.

The Plone add-on product Easy Template provides an easy method to show any Plone view(s) on a normal page. Easy Template uses a Django-like template syntax (Jinja 2 engine). It gives you great power to drop dynamic content easily on pages. Easy Template also has some security awareness, ensuring that the members using it cannot escape from their sandbox.

Easy Template works in WYSIWYG and non-WYSIWYG modes
Example of how to show a sitemap on an arbitrary Plone page
It turns out to be:
There is no such thing as a “views reference” for Plone. View names and functions can be figured out by searching and reading through the ZCML and Python files in the Plone source tree. Some developer insight is needed. For example, for sitemap we can do the grep search:

grep -Ri --include="*.zcml" sitemap *

Then read Products/CMFPlone/browser/configure.zcml and Products/CMFPlone/browser/sitemap.py.

The same thing works in portlets. Use the Templated Portlet portlet type. See the Easy Template PyPI homepage for the full reference of the product’s potential.

About the author Mikko Ohtamaa

Setup.py sdist not including all files
Posted on July 24, 2009 by Mikko Ohtamaa
Filed Under plone, python, technology

Setuptools has many silent failure modes. One of them is failure to include all files in an sdist release (well, not exactly a failure, you could RTFM, but the default behavior is unexpected). This post will serve as a google-yourself-answer for this problem, until we get the new, shinier Distribute solving all of our problems.

I b0rked the release for plonetheme.twinapex. The version 1.0 package didn’t include media assets and ZCML configuration files. Luckily the Python community reacted quickly and I got advice on how to fix it.

By default, setuptools includes only *.py files. You need to explicitly declare support for other file types in the MANIFEST.in file. Example MANIFEST.in (plonetheme, built in PyDev):

recursive-include plonetheme *
recursive-include docs *
global-exclude *pyc
global-exclude .project
global-exclude .pydevproject

XHTML mobile profile transformer and cleaner for Python
Posted on July 23, 2009 by Mikko Ohtamaa
Filed Under mobile, plone, python, technology

Mobile phones, and especially mobile site validators, are very picky about the validity of XHTML. It must not be just any XHTML, but the special mobile profile XHTML. Also, search engines like Google will punish you in the mobile search results if your site fails to conform to the mobile profile.
This is especially troublesome if you display external content (RSS feeds, ATOM feeds) on your mobile site. Incoming HTML cannot be guaranteed to follow any specification.

To solve this problem, we have created the gomobile.xhtmlmp Python library, which helps you to transform any HTML content to valid XHTML MP. The library is piloted on the plonecommunity.mobi site, which uses aggregated content from varying sources. The library is based on lxml.html.Cleaner. The library is part of the GoMobile project, which aims to create world class Python mobile web development tools.

Highlights
As an example, we integrated gomobile.xhtmlmp into the Feedfeeder Plone add-on product. Enjoy. |
