Email: dev@concena.com       Twitter: @dbrucegrant
Nuxeo Event Trigger - firing a simple one-time event

I see that in Nuxeo 5.6 the SchedulerService has moved packages and undergone some renaming. I discovered this after looking further into a question on Nuxeo Answers.

In order to use the Scheduler Service effectively it would be extremely helpful if the underlying SchedulerServiceImpl registerSchedule method could handle the case where a CRON expression (in the schedule object) is null. When the CRON expression is null then a SimpleTrigger could be created that could be run 1 .. n times.

Something like...

if (cronExpression == null) {
  // use SimpleTrigger
} else {
  // use CronTrigger
}

This would nicely support one-time events created in code without having to use a CRON expression. Would be even nicer if there was a way to automatically remove the one-time scheduled service on firing to avoid having to manually unregister. Maybe there's an easier way?
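
To make the one-time idea concrete, here is a minimal sketch using only the JDK's ScheduledExecutorService. It is not the Nuxeo SchedulerService or Quartz API: the OneShotScheduler class, its method names, and the nullable periodMs parameter (standing in for the CRON expression) are all hypothetical. A null period fires the job once and removes the registration afterwards, so no manual unregister is needed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a schedule registry; NOT the Nuxeo SchedulerService API.
class OneShotScheduler {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final Map<String, ScheduledFuture<?>> registered = new HashMap<>();

    // periodMs plays the role of the CRON expression: null means "fire once".
    synchronized void register(String id, Runnable job, long delayMs, Long periodMs) {
        ScheduledFuture<?> future;
        if (periodMs == null) {
            // One-time event: run the job, then drop our own registration.
            future = executor.schedule(() -> {
                try {
                    job.run();
                } finally {
                    forget(id);
                }
            }, delayMs, TimeUnit.MILLISECONDS);
        } else {
            // Recurring event: stays registered until unregister(id) is called.
            future = executor.scheduleAtFixedRate(
                    job, delayMs, periodMs, TimeUnit.MILLISECONDS);
        }
        registered.put(id, future);
    }

    // Manual removal, still required for recurring schedules.
    synchronized void unregister(String id) {
        ScheduledFuture<?> future = registered.remove(id);
        if (future != null) {
            future.cancel(false);
        }
    }

    private synchronized void forget(String id) {
        registered.remove(id);
    }

    synchronized boolean isRegistered(String id) {
        return registered.containsKey(id);
    }

    void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        OneShotScheduler scheduler = new OneShotScheduler();
        CountDownLatch fired = new CountDownLatch(1);
        scheduler.register("one-time", fired::countDown, 50, null);
        fired.await(2, TimeUnit.SECONDS);
        Thread.sleep(50); // give the task a moment to remove its registration
        System.out.println("still registered: " + scheduler.isRegistered("one-time"));
        scheduler.shutdown();
    }
}
```

The locking is deliberately coarse: because register and forget share the same monitor, the task cannot remove its entry before register has stored it, even with a zero delay.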

First Impressions - Nuxeo 5.6

Last week I downloaded a Nuxeo 5.6 build. After downloading, it took me only a few minutes to get the instance up and running - pure vanilla with the exception of using PostgreSQL 9.1 as the database. I have only had a bit of time to play with the release and explore some of the changes, but I have to say that my first impressions are very favourable.

  • Running Nuxeo 5.6 on the same server as I run 5.5 using the same components (DM, Collab, and DAM) the 5.6 server started slightly faster (this was not a scientific comparison but the results seem consistent); the start time on the server went from 21 sec (5.5) to about 19 sec (5.6); server spec - see below
  • I felt the responsiveness (performance of the UI) was improved significantly (again this is my out of the box impression -- I may be crazy). I did find that the default navigation tree jumped a little (like a pixel shift) and that was a bit disconcerting. However, remember I am not looking at released code and so there is probably much last minute testing and tweaking underway in the Nuxeo development bunker
  • The UI takes on a much more polished look than its predecessors. This is reflected in updated graphics and in Action menus that just look more professional (it's too bad this new look doesn't seem to extend to sub-actions - at least out of the box - although I haven't dug under the covers and maybe this is easily configured)
  • Reorganization of the default DM UI makes for (in my opinion) better use of available space - some of that's achieved with icons and some is achieved through a combination of better contrast and better presentation of simple data (like current document state and contributors)
  • Every time I visit the Admin Center there is cool (and more importantly useful) functionality exposed - I only sped briefly through this section - enough time to see a number of new tabs, functions and the same UI improvements as DM
  • I didn't, however, see any outward facing improvements in the DAM interface (anybody help here - am I missing something?)
  • The cookie trail seems to have shrunk in size, which personally I like, because I always like to see the entire path up there, but it may be too small for some
  • One part of the UI I would really like to see streamlined is security administration. This is one part of the UI that has been with the product for as long as I have been using it, and in my opinion it's just clunky - needs to be more intuitive and easier to manage. Not simple I know, but given the other UI improvements hopefully somebody will turn their attentions to this area of the UI in an upcoming release
  • Really like the new icons although it will take a bit to remember which one is which (this isn't a complaint, rather an observation)
  • Just haven't had time to make it through Social Collaboration in any detail but the new dashboard looks great and although it may seem trivial, I really like the idea of a built-in voting system
  • I know there are numerous internal improvements (some obviously reflected in UI and load speed) to look into, but not this time around

The pre-release code seems extremely stable - out of the box, running in less than 15 minutes, and no crashes! Spending a few hours looking at 5.6 scratched the surface but was well worth the investment in time.

Server Spec: i7, 8 cores, 32GB RAM, SSD Boot/App disk, 7200RPM SATA for data, running CentOS 6.2 64 bit.

Nuxeo Roadmap

I had signed up to attend, but ended up missing, the Nuxeo roadmap session. My customers come first and I had to take care of a few time-sensitive issues. C'est la vie. The session is up on the site so at least I could watch after the fact. Apart from what I have previously written about with respect to DAM, I had some more general questions about the future of Nuxeo that I would have liked to ask. At least one was answered during the webinar and another touched on briefly.

  1. There are many third party libraries that are part of Nuxeo. Some of these libraries are fairly old (in technology terms) and I think they need to be updated much more frequently. Libraries for PDF, image manipulation, metadata extraction, and more need regular updates. I would like to know when/if these will become part of the regular hotfix / release cycle.
  2. Integration of Studio and IDE with full two-way engineering. The ability to move back and forth between the environments would have great value. It looks as though 5.6 will start down the path but I will have to wait and see whether there is value in the initial offering.
  3. Whether on premise or in the cloud, performance is of critical concern - especially with huge digital assets. I would like to know specifics on plans to improve overall imaging and asset management performance.
  4. Nothing specific in mind, but I would like to see fewer error pages popping up in the application. And when they do it would be nice to have a way to return to the context prior to the application cacking. Just wondering if there is any work under way to tighten up and improve error handling.
  5. More into the future - any thoughts or plans to support any NoSQL database variants?

I was glad to see more work in the mobile space as this is critical in my opinion to delivering focused, effective content applications to users regardless of where they are!

Hopefully I get a chance to attend the next roadmap session.

Beyond Nuxeo 5.6

Looks like the 5.6 release of Nuxeo has some promising UI/usability improvements. No doubt there will be improvements in other areas as well. Hopefully I will have some time in the not too distant future to investigate.

I'm looking beyond 5.6, specifically with my new DAM goggles on. There is demand for industrial strength DAM across most verticals. In some verticals, digital asset management underpins the business itself, and in these situations industrial strength DAM is critical. I would like to see a stronger focus on improving the DAM product (and underlying CAP/imaging components).

I think it's time for Nuxeo to step up efforts to improve DAM over and under the covers - to get it on a level playing field with commercially available DAM solutions.

What do I think are the most important areas on which to focus?

1. Format agnosticism.

Whether it's a raw image from a Nikon camera, a multi-layer PSD, or one of a myriad of other formats, the DAM solution needs to be able to handle it.

2. Size agnosticism.

Industrial strength DAM products must handle 1.5GB TIFF files just as gracefully as they handle a 5MB JPEG. While processors, RAM, and disk speed will determine overall performance, the solution should have workarounds for any reasonably sized hardware or virtual machine. This also means finding and eradicating all software-based limitations for handling files > 2GB in size.

3. Metadata - all of it.

Filtering and finding images is one of the most important functions (in my opinion) of a digital asset system. Using image metadata for searching and filtering requires efficient automated extraction. And, it means that every type of common metadata must be easily extracted from supported image types. This includes, but is not limited to, XMP, IPTC, and EXIF.

4. Performance Profiling data and Sizing Charts

The challenges with moving and processing multi-GB files are much different than those of smaller files (where transaction times - even aggregated - are not all that significant). It would be extremely useful if Nuxeo published recommended configurations for all Nuxeo applications, but DAM specifically. The recommendations could include JVM sizing, database configuration, OS configuration, imaging component tweaks, RAM, disk usage, etc. This is a big complex area, but some base level of recommendations would be helpful as a starting point.

5. Speed, Speed, Speed

With big images, and lots of them, processing speed is critical. I have seen a number of opportunities in the core Nuxeo imaging code to improve overall transaction speed. One of the biggest areas for possible improvement is the imaging library - especially when spinning off multiple JPEG versions of the original image. Lots of opportunity here.

6. Third party Library Updates - More Frequent/Alternates/Plugins

Libraries that support third party image processing (ImageMagick, metadata-extractor, etc.) have to be kept up-to-date, preferably as part of point releases. This ensures that newer image formats can be handled with minimal fuss. Maybe it's time to look at alternatives to some of the image processing components. Is OpenImageIO (OIIO) a possible alternative to ImageMagick? Is there a way to abstract the third party image library in such a way that OS X users could take advantage of the CoreImage library for image processing (which is far faster than the ImageMagick equivalents)?

[Added May 28 2012:

7. Parameter driven functionality

An example... When an asset is ingested into the DAM repository three images are generated - an original size JPEG, a medium size JPEG, and a thumbnail. The number of images generated and their sizes are hard-coded in the application. It would be nice if the configuration of this functionality were exposed in a configuration document, allowing additional renditions, lower resolution renditions, different formats, etc. In my opinion this would greatly improve the flexibility and applicability of DAM.]

What else can Nuxeo improve in the current DAM product to make it more industrial strength?

Cheers,

Bruce.

Nuxeo, DAM, ImageMagick and really big files

Dam frustrating is a good way to describe my last three days. I have been working to tune the Nuxeo DAM import for very large images … frustrating because there are many pieces, moving big images takes time, and debugging/documenting the whole process introduces a natural latency.

Nuxeo import relies on ImageMagick for image-related tasks (e.g. resizing and cropping) - during import ImageMagick is used heavily and is by far the most time-consuming element of the import process. If ImageMagick slows then so does your import.

Nuxeo DAM, and for that matter the core imaging code, works really well with most images. "Works really well" extends to the integration with ImageMagick. But take a step into the world of very large/complex image files and the "works well" fantasy world takes a halting step into reality.

What do I mean by very large image (or complex) files? Three possible interpretations:

In absolute terms, any image file over 2GB is very large (well, technically anything over 2^31-1 bytes, which is the largest value an int can hold in Java). This limit is important because any code that uses int values for file size or related processing will be limited to 2GB files. There are still some limitations within Nuxeo when it comes to uploading very large files (not just images) - e.g., in DAM I can import a 3.5GB video file through the importer but the same import fails when attempted through the UI. I do know that Nuxeo developers have been working actively to remove these barriers.
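
A small illustration of why the int limit bites. This is a generic Java sketch, not code taken from Nuxeo, and the class and method names are made up for the example. Narrowing a 3.5GB byte count to an int silently wraps to a negative number, while Math.toIntExact at least fails loudly.

```java
class FileSizeLimit {
    // Buggy pattern: narrowing a byte count to int silently wraps above 2^31 - 1.
    static int sizeAsInt(long bytes) {
        return (int) bytes;
    }

    // Safer pattern: throws ArithmeticException instead of wrapping.
    static int sizeAsIntChecked(long bytes) {
        return Math.toIntExact(bytes);
    }

    public static void main(String[] args) {
        long threeAndAHalfGb = 3_758_096_384L; // 3.5 * 1024^3 bytes

        System.out.println(Integer.MAX_VALUE);          // 2147483647
        System.out.println(sizeAsInt(threeAndAHalfGb)); // -536870912

        try {
            sizeAsIntChecked(threeAndAHalfGb);
        } catch (ArithmeticException e) {
            System.out.println("size does not fit in an int");
        }
    }
}
```

Keeping sizes in a long end to end avoids the problem altogether; the checked conversion is only useful where an API forces an int on you.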

In relative terms, very large can be any image that is big enough (or the requested operation complex enough) to require more physical memory during image processing than your OS can make available. For example, images in the 1GB range can be a problem on a Windows box with 4GB of RAM (assuming Nuxeo server is also running). ImageMagick runs fastest when the entire pixel map can be loaded into RAM.  If the entire pixel map can't be loaded into memory then image processing slows dramatically – that’s because ImageMagick will page parts of the image in/out of memory to complete the requested task.
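
A rough way to guess whether an image will fit in RAM is width x height x bytes per pixel. The sketch below assumes 8 bytes per pixel (four 16-bit channels), which I believe is typical for a Q16 ImageMagick build; check your own build before relying on the figure. The class name and the example dimensions are hypothetical.

```java
class PixelCacheEstimate {
    // Assumption: 8 bytes per pixel (4 channels x 16 bits), as in a Q16 build.
    static final int ASSUMED_BYTES_PER_PIXEL = 8;

    // Uncompressed pixel map size; multiplyExact guards against long overflow.
    static long estimateBytes(long width, long height, int bytesPerPixel) {
        return Math.multiplyExact(Math.multiplyExact(width, height), (long) bytesPerPixel);
    }

    static boolean fitsInMemory(long width, long height, int bytesPerPixel,
                                long availableBytes) {
        return estimateBytes(width, height, bytesPerPixel) <= availableBytes;
    }

    public static void main(String[] args) {
        long width = 20_000;  // hypothetical scan, 20000 x 15000 pixels
        long height = 15_000;
        long needed = estimateBytes(width, height, ASSUMED_BYTES_PER_PIXEL);
        long twoGiB = 2L * 1024 * 1024 * 1024;

        System.out.println("pixel map bytes: " + needed); // 2400000000
        System.out.println("fits in 2 GiB: "
                + fitsInMemory(width, height, ASSUMED_BYTES_PER_PIXEL, twoGiB)); // false
    }
}
```

Note that the size on disk tells you little here: a compressed file of a few hundred MB can expand to several GB once decoded.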

Complexity of the image file itself can exacerbate the situation. For example, multi-layered PSD images require far more processing time than a single layer file of the same size. To the ImageMagick identify command each layer is effectively an image of its own, with its own metadata. So it takes ImageMagick much longer to trawl through multiple images to extract the required information.

Add to this complexity the differences in memory allocation/reservation of different operating systems and it makes it challenging to tune Nuxeo DAM and ImageMagick.

I have spent the last three days running through numerous configurations, file sizes, limits, memory allocations, etc. trying to better understand the issues in order to move forward. What I found is...

  • So long as you're running VCS then Postgres tuning is not really important for image processing (of course it is very important for other reasons)
  • Technically, the amount of RAM you set aside for Nuxeo will not limit the size of images you can import, but practically speaking, if you don't allocate enough RAM to reflect the size and volume of images imported, displayed, etc. in your system then your DAM solution will be less than useful. The RAM required is dependent on numerous other factors so it's difficult to be entirely prescriptive
  • The amount of physical RAM available to ImageMagick is absolutely critical to timely image processing. Without enough RAM, ImageMagick will page to disk, run times will increase dramatically, and it's likely your Nuxeo transaction will time out. The end result … nothing will be imported (the transaction will be rolled back). You might want to set upper limits on the resources available to ImageMagick (using environment variables) so all available RAM isn't gobbled up
  • The speed of your disk is also important. Slow disk = slower image processing. Longer to read, longer to write, longer to create any interim temp files. If your disk is on the network then make sure it's on a high speed connection. You might want to consider a solid state drive for ImageMagick temporary files.
  • CPU speed, number of cores, and number of threads allocated (for ImageMagick) must also be considered - especially from a global perspective, since multiple image commands can be running simultaneously.
  • The location of ImageMagick temp files is important. ImageMagick commands produce temp files, the location of which can be controlled with the MAGICK_TEMPORARY_PATH environment variable. Set the path variable to a disk with lots of free space, preferably not the same disk as your OS and Nuxeo are on.
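
The resource limits and temp file location from the list above can be set per process through ImageMagick's environment variables. Here is a minimal sketch with ProcessBuilder. MAGICK_TEMPORARY_PATH, MAGICK_MEMORY_LIMIT, MAGICK_MAP_LIMIT and MAGICK_THREAD_LIMIT are real ImageMagick variables, but the values, the temp directory, the file names and the ImageMagickLauncher class are illustrative assumptions, and this is not how Nuxeo itself launches ImageMagick.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class ImageMagickLauncher {
    private static final long GIB = 1024L * 1024 * 1024;

    // Prepares (but does not start) a command with capped ImageMagick resources.
    static ProcessBuilder limitedCommand(String tempDir, List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        Map<String, String> env = pb.environment();
        // Temp files go to a roomy disk, ideally not the OS/Nuxeo disk.
        env.put("MAGICK_TEMPORARY_PATH", tempDir);
        // Illustrative caps in bytes; tune to the RAM actually available.
        env.put("MAGICK_MEMORY_LIMIT", Long.toString(4 * GIB));
        env.put("MAGICK_MAP_LIMIT", Long.toString(8 * GIB));
        // Cap threads, since several image commands may run at once.
        env.put("MAGICK_THREAD_LIMIT", "4");
        pb.redirectErrorStream(true);
        return pb;
    }

    public static void main(String[] args) {
        // Hypothetical paths and file names, for illustration only.
        ProcessBuilder pb = limitedCommand("/data/magick-tmp",
                Arrays.asList("convert", "big-input.tif",
                        "-resize", "1024x1024", "medium.jpg"));
        System.out.println(pb.command());
        System.out.println(pb.environment().get("MAGICK_MEMORY_LIMIT"));
        // pb.start() would launch ImageMagick here if it is installed.
    }
}
```

Setting the variables on the child process rather than globally lets you use different limits for different kinds of jobs on the same server.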

Each case is different, but the end result for Nuxeo-ImageMagick import tuning: expect to spend extra time understanding your requirements, get ready to buy more memory, and be prepared to put in some effort to get everything running quickly!
