Ghostscript projects seeking developers
For other information, see the Ghostscript overview.
There are many projects that would improve Ghostscript and that we would like to do, but for which we don't have enough resources. If you would like to take responsibility for any of these projects, please contact us. Additional comments on implementation approaches or project goals are in italic type like this.
We would like Ghostscript to work with the free emx/gcc and rsx libraries, to provide an alternative DOS, Windows 95/98/NT, and OS/2 implementation that requires no proprietary, commercial compilers. We think Ghostscript's existing OS/2 makefile already includes most of what is needed. If someone is willing to do the work, we will be happy to include this in our list of supported platforms and to distribute the makefiles. If interested, please consult
MS Windows has a "language monitor" capability which would allow Ghostscript to be invoked seamlessly to process input files in any language Ghostscript could handle (currently PostScript and PDF) and for any printer for which Ghostscript had a driver. Doing this properly would require integrating Ghostscript with Windows' "Add Printer" dialog, and would also require creating a PPD for Ghostscript. Russell Lang's RedMon program provides some, but not all, of this capability.
Currently, Ghostscript can work as a "helper application" for the Netscape browser, but not as a plug-in; the latter would integrate it more closely with the browser. We aren't sure what doing this would involve; we've also heard by rumor that it's already been done.
Currently, drivers can be written so that converting PostScript to a list of graphical objects can run in one thread, and rasterizing the objects can run in another thread. However, drivers must be written specially if they are going to do this. We would like to change the architecture so that any driver can work this way. We would also like to support dual-threaded operation for drivers that produce high-level output, such as the PDF writer. Doing this would require separating banding from the multithreaded logic. Also, currently each thread has its own allocation pool: this is unnecessary in the normal case, since Ghostscript now supports properly locked access to the C heap, but embedded systems still need to use a fixed-size area for the rasterizing thread. With a locked, shared allocator, the rasterizing thread could use the full set of band list functions; with a fixed-size area and a separate allocator, only a subset is available, as is the case now for dual-threaded drivers.
Currently, drivers must be linked into the executable. We would like to be able to load drivers dynamically. Doing this requires defining a platform-independent API (presumably extending the current gp_* APIs) that would work at least on Linux, vendor Unix, MS Windows, and Macintosh. Unix systems should include Sun, HP, AIX, IRIX, DEC; Linux ELF and a.out formats should both be supported. Consider the Netscape plug-in architecture.
The PostScript 'setpagedevice' function implements matching of media and page size requests to available media, page orientation, and paper handling (duplex, etc.). Currently it is implemented in PostScript code, which means it is not available for use with other input languages. (It is available for PDF, which Ghostscript implements on top of PostScript, but not for the not-yet-freely-available PCL interpreters that use the Ghostscript library, or for possible future SVG or similar interpreters.) We would like to move this function into C. The device driver will be required to send page parameters up to PostScript to be stored in a resource. This project also includes implementing policy handling in the device drivers. DeferredMediaSelection should also be implemented.
In a few cases, it would be desirable to provide a 'tee' capability for drivers: specifically, for generating small, low-resolution 'thumbnail' images concurrently with other output. Probably the simplest way to do this is to generate a band list and then process it twice. This is not completely trivial, since the band list does include device resolution information and scaling would be required for some constructs.
Each available output device should provide an instance of the OutputDevice resource category, which gives the available page sizes, resolutions, media classes, process color models, and other information about the device. This would replace the current non-standard use of a 4-element PageSize in the InputAttributes entry of the page device dictionary.
Currently, the maximum length of the OutputFile parameter is a compile-time constant, gp_file_name_sizeof. This is appropriate for ordinary file names, since this constant is the platform's limit on the length of a file name. However, if OutputFile is a pipe, the length should not be limited in this way. This is probably a small project: it requires allocating the file name dynamically, and freeing it in the finalization routine that gets called when a driver instance is freed.
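The allocate-and-finalize pattern described above might look roughly like the following. This is only a minimal sketch: demo_device, demo_set_output_file, and demo_device_finalize are hypothetical names, and the standard C allocator stands in for whatever allocator the driver would really use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a driver instance; not Ghostscript's device struct. */
typedef struct demo_device_s {
    char *output_file;          /* heap-allocated copy, NULL when unset */
} demo_device;

/* Store a copy of fname of any length.
   Returns 0 on success, -1 if memory could not be allocated. */
int demo_set_output_file(demo_device *dev, const char *fname)
{
    size_t len;
    char *copy;

    assert(dev != NULL && fname != NULL);
    len = strlen(fname);
    copy = malloc(len + 1);
    if (copy == NULL)
        return -1;              /* the previous name is kept on failure */
    memcpy(copy, fname, len + 1);
    free(dev->output_file);     /* free(NULL) is a no-op */
    dev->output_file = copy;
    return 0;
}

/* Finalization routine: to be called when the driver instance is freed. */
void demo_device_finalize(demo_device *dev)
{
    assert(dev != NULL);
    free(dev->output_file);
    dev->output_file = NULL;
}
```

Because the name is copied into storage sized to fit it, a long pipe command is accepted just as an ordinary file name would be.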
We would like to provide (Adobe) PrintGear and (H-P) PPA output drivers for Ghostscript, but the specifications for these protocols are not published. If you can provide them to us without violating any agreements, please let us know. (Some work has already been done on reverse-engineering these protocols, but we don't have references to it.)
We would like to improve the high-level PostScript-writing pswrite driver to bring it up to parity with the PDF-writing driver (including the many improvements in the latter being implemented in Ghostscript 6.xx). Specifically, we want it to write text as text rather than bitmaps, and to consistently write images in their original high-level form. We have already started to factor out code that should be common to these two drivers, specifically for writing embedded fonts and compressed data streams.
There is one small part of this project that would be especially valuable and could be done independently (although it might have to be partly or entirely redone later): compressing images. Currently the driver only compresses character bitmaps, and doesn't compress other images at all. It should use the CCITTFaxEncode filter for 1-bit-deep images, and plane-separated LZWEncode compression for color images, using the miGIF algorithms that are believed to be free of the Welch patent for the latter. Even better, it should try several methods on each image and use the one that works best.
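The "try several methods and use the one that works best" idea can be sketched as follows. The encoder signature and all demo_ names are assumptions made for illustration, and the two toy encoders (plain copy and a simple run-length code) merely stand in for real filters such as CCITTFaxEncode or LZWEncode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoder signature: returns the number of bytes written to
   out, or 0 if the result does not fit in out_cap (empty input is not
   handled in this sketch). */
typedef size_t (*demo_encode_proc)(const unsigned char *in, size_t in_len,
                                   unsigned char *out, size_t out_cap);

/* Toy method 1: store the data unchanged. */
size_t demo_encode_raw(const unsigned char *in, size_t in_len,
                       unsigned char *out, size_t out_cap)
{
    size_t i;
    if (in_len > out_cap)
        return 0;
    for (i = 0; i < in_len; i++)
        out[i] = in[i];
    return in_len;
}

/* Toy method 2: run-length pairs (count, value), count in 1..255. */
size_t demo_encode_rle(const unsigned char *in, size_t in_len,
                       unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;
    while (i < in_len) {
        unsigned char v = in[i];
        size_t run = 1;
        while (i + run < in_len && in[i + run] == v && run < 255)
            run++;
        if (o + 2 > out_cap)
            return 0;
        out[o++] = (unsigned char)run;
        out[o++] = v;
        i += run;
    }
    return o;
}

/* Run every method and return the index of the one with the smallest
   output (or -1 if all failed); *best_len receives that size. */
int demo_pick_best(const demo_encode_proc *procs, int nprocs,
                   const unsigned char *in, size_t in_len,
                   unsigned char *scratch, size_t scratch_cap,
                   size_t *best_len)
{
    int i, best = -1;
    assert(procs != NULL && best_len != NULL);
    *best_len = 0;
    for (i = 0; i < nprocs; i++) {
        size_t n = procs[i](in, in_len, scratch, scratch_cap);
        if (n != 0 && (best < 0 || n < *best_len)) {
            best = i;
            *best_len = n;
        }
    }
    return best;
}
```

A real driver would then re-run (or keep the output of) the winning method; the sketch only reports which one won.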
The PDF-writing driver needs to look at some of the DSC comments in the PostScript input. We have a plan and design for a general DSC-reading capability, but we need help implementing a full DSC parser and also help implementing the specific set of DSC comments that the PDF writer will need to consider. This set is currently specified very poorly in the Adobe Acrobat Distiller Parameters documentation: we are working with Adobe to find out the details.
Currently, the PCL 5 drivers produce only bitmaps; the PCL XL driver produces high-level graphics and sometimes high-level images, but low-level text. We would like to improve these drivers to produce higher-level, smaller output. This was a very low-priority project; it has become more important now that H-P's laser printers are shipping with less memory.
We would like a "GDI driver" for MS Windows that would implement more higher-level constructs (specifically for text). The mswin and mswinprn drivers both do some of this. Some of the 'xfont' support code for MS Windows should be useful. We were frustrated in the past because the GDI calls for getting font sizes and metrics consistently returned incorrect information and provided no way to get the correct information; perhaps this has been fixed in 32-bit Windows. We believe that H-P, Russell Lang, and perhaps others are working in this area, but we can always use more help.
The PDF writer needs to be able to generate thumbnails (small previews). We might do this through the 'tee' capability mentioned above. However, we currently prefer the idea of implementing a completely separate program to add thumbnails to an arbitrary, existing PDF file: this would allow Ghostscript to add thumbnails to PDF files generated by other programs. Much of the code needed to do this has already been written for Ghostscript's PDF linearizer: see lib/pdfwrite.ps. A user has implemented this as well, using a separate program that calls Ghostscript: see http://www.uni-giessen.de/~g029/eurotex99/oberdiek/.
In addition to factoring out the error diffusion code as described below, we would like to see another attempt at reducing the enormous volume of code for color inkjet drivers. There are three sets of drivers (gdevcdj.c, gdevstc.c, gdevupd.c) with much overlapping functionality. The latter two driver families make good attempts at factoring out things like head geometry and canned control strings, but we think this problem deserves another pass, especially in the hope of consolidating these drivers into a single family.
See below under "Notification for glyph decaching."
Currently, all images are decompressed by the interpreter before being passed to the graphics library; the PDF writer may then compress them again. Ordinarily, this only slows things down a little, but in the case of DCT-encoded images that are being DCT-encoded in the output, image degradation may occur. Ideally, the implementation should be smart enough to not decode and re-encode the image. However, making this work properly is difficult. This would probably involve extending the library APIs for images so that they could pass a stream, possibly including filters, instead of the (fully decoded) data rows.
Currently, the library supports a maximum of 32 bits of data per pixel; we would like to raise this limit to 64 bits on systems where the 'long' data type is 64 bits wide. The gx_color_index type is already defined as 'long', but there are many places where the type bits32 is used for pixel values; there is a 32-bit stored-image "device", but there is no 64-bit device; a few algorithms and tables have knowledge of the 32-bit width built into them, only because the C preprocessor doesn't have any kind of loop or repetition capability.
The PostScript specification includes an option for the interpreter to implement trapping (adjustments of object boundaries to prevent visual anomalies caused by slight misregistration of different ink layers): we would like to implement this. This is a complex and difficult area; even many Adobe RIPs don't do it.
We would like to provide an option for good integration of the FreeType font rasterizer into Ghostscript, since the FreeType code is better than Ghostscript's current rasterizers. We understand that some users in Japan have done this already, but we don't know if this is a good starting point (the only documentation is in Japanese). Their work uses Ghostscript's 'xfont' interface, which is how Ghostscript interfaces to platform facilities such as the X and MS Windows font capabilities. This is a device-level interface and not the best place to do this: we would rather have the graphics library interface to FreeType directly.
PDF 1.3 requires direct support for ICC color profiles in the form of a new ICCBased color space family; we believe Adobe will add such support to PostScript as well. We would like to integrate such support into the graphics library. One possible approach is to interface with the existing CS/CRD library code so that optimizations and improvements would carry into ICC color as well as CIE color. However, we've been told that the icclib package at http://web.access.net.au/argyll/color.html is a good one, and it handles all of its own color mapping. We would suggest that anyone working on this project start by examining that package.
Currently, knowledge of the specific data formats and algorithms for halftoning permeates too many places in the library. We would like halftoning to be more "object oriented" (using virtual procedures) so that we could support other halftoning methods such as direct use of threshold arrays, or the double-rectangle approach added in newer PostScript versions. Threshold arrays take much less space than the current representation, generally at the expense of longer rendering time for black-and-white images; double-rectangle representation would give us a better implementation of AccurateScreens. We might want to store both threshold arrays and the current representation.
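For readers unfamiliar with the term, "direct use of threshold arrays" means comparing each pixel's gray level with a tiled table of thresholds. The following is a minimal sketch only: the 4x4 ordered-dither matrix and the demo_ names are illustrative assumptions, not Ghostscript data or APIs.

```c
#include <assert.h>

/* A 4x4 ordered-dither matrix scaled to thresholds in 0..255.
   Illustrative values only. */
#define DEMO_TW 4
#define DEMO_TH 4
static const unsigned char demo_threshold[DEMO_TH][DEMO_TW] = {
    {   8, 136,  40, 168 },
    { 200,  72, 232, 104 },
    {  56, 184,  24, 152 },
    { 248, 120, 216,  88 }
};

/* Return 1 (white) if the gray level exceeds the tiled threshold
   at device position (x, y), else 0 (black). */
int demo_halftone_pixel(int x, int y, unsigned char gray)
{
    assert(x >= 0 && y >= 0);
    return gray > demo_threshold[y % DEMO_TH][x % DEMO_TW];
}
```

The table needs one entry per cell of the tile regardless of how many gray levels are rendered, which is where the space saving comes from; the cost is one comparison per pixel.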
Currently, several different inkjet drivers implement their own, very similar but slightly differing error diffusion methods. This has caused severe code bloat as well as tempting future driver writers to contribute to it further. We want to factor out error diffusion into a common set of facilities that drivers can use. We would like to design these facilities so that they can easily interface to the Even-Toned Screening algorithms from artofcode (Raph Levien), to the extent that these will be Open Source.
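As an illustration of the kind of routine a shared facility might export, here is a minimal sketch of one well-known method, Floyd-Steinberg error diffusion, applied to a single gray scanline. It is not taken from any of the existing drivers; the function name and the buffer layout are assumptions.

```c
#include <assert.h>
#include <string.h>

/* Diffuse one 8-bit gray scanline to 1-bit output (out[x] is 0 or 1).
   err_cur holds the error carried into this row; err_next receives the
   error for the following row. Both have width + 2 entries, with one
   padding entry at each end so edge pixels need no special case. */
void demo_fs_row(const unsigned char *in, unsigned char *out, int width,
                 int *err_cur, int *err_next)
{
    int x;

    assert(width > 0);
    memset(err_next, 0, (size_t)(width + 2) * sizeof(int));
    for (x = 0; x < width; x++) {
        int v = in[x] + err_cur[x + 1];   /* pixel plus accumulated error */
        int q = (v >= 128) ? 255 : 0;     /* nearest output level */
        int e = v - q;                    /* quantization error */

        out[x] = (unsigned char)(q != 0);
        err_cur[x + 2]  += e * 7 / 16;    /* right neighbor */
        err_next[x]     += e * 3 / 16;    /* below left */
        err_next[x + 1] += e * 5 / 16;    /* below */
        err_next[x + 2] += e * 1 / 16;    /* below right */
    }
}
```

A caller would swap err_cur and err_next after each row. Keeping the weights and the quantizer behind one entry point like this is what would let different drivers share the code instead of each carrying a slightly different copy.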
The Ghostscript distribution includes a stochastic threshold array. This array has some gamma correction built into it, which works well for some output devices and not for others. We would like to provide a version of this array without (or with less) gamma correction. We have original data available from which this could be done fairly easily.
The PostScript language defines many functions relevant to graphics rendering as being implemented by arbitrary PostScript procedures: transfer (gamma correction), black generation, undercolor removal, several stages of CIE color space and rendering, and color mapping for Separation and DeviceN spaces. Since the graphics library can't call PostScript procedures, Ghostscript currently samples these procedures at a fixed number of points and interpolates linearly between the samples. As of Ghostscript 6.20, the library can interpret a restricted subset of PostScript procedures directly (basically those that only use arithmetic and comparisons: no loops, sub-procedures, or data structures). Changing the rendering functions to use this approach when possible would greatly improve output quality when the functions are very non-linear (which we have actually seen in practice). This should only be done if the function is, in fact, severely non-linear, since interpreting the function definition will almost always be much slower than interpolating in the table.
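The sample-and-interpolate scheme, and the way it fails for a very non-linear function, can be sketched as follows. The table size of 256 and every demo_ name are assumptions for illustration; a C function pointer stands in for the PostScript procedure.

```c
#include <assert.h>

#define DEMO_SAMPLES 256        /* assumed table size, for illustration */

/* Stands in for a PostScript procedure mapping [0,1] to a number. */
typedef double (*demo_func)(double);

/* Sample f at DEMO_SAMPLES evenly spaced points in [0,1]. */
void demo_sample(demo_func f, double table[DEMO_SAMPLES])
{
    int i;
    assert(f != 0);
    for (i = 0; i < DEMO_SAMPLES; i++)
        table[i] = f((double)i / (DEMO_SAMPLES - 1));
}

/* Evaluate by linear interpolation between neighboring samples. */
double demo_lookup(const double table[DEMO_SAMPLES], double x)
{
    double pos, frac;
    int i;

    if (x <= 0)
        return table[0];
    if (x >= 1)
        return table[DEMO_SAMPLES - 1];
    pos = x * (DEMO_SAMPLES - 1);
    i = (int)pos;
    if (i >= DEMO_SAMPLES - 1)            /* guard against rounding */
        return table[DEMO_SAMPLES - 1];
    frac = pos - i;
    return table[i] + frac * (table[i + 1] - table[i]);
}

/* Two example functions: one the table reproduces well, and one with
   a sharp step between the first two sample points. */
double demo_linear(double x) { return x; }
double demo_sharp(double x)  { return x < 0.001 ? 0.0 : 1.0; }
```

With these definitions, looking up 0.002 in the table for demo_sharp gives about 0.51, while the function itself returns 1 there; interpreting the procedure directly would avoid that error at the cost of speed.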
Currently, if a CIE rendering dictionary uses a lookup table for the final step, Ghostscript always interpolates linearly between the entries. Cubic interpolation should be supported as an option. A cubic interpolation option is also needed for general table-lookup Functions.
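To show what a cubic option buys, here is a minimal one-dimensional sketch using Catmull-Rom interpolation. The choice of Catmull-Rom is an assumption made for the example (the text above does not say which cubic scheme should be used), and the demo_ names are hypothetical.

```c
#include <assert.h>

/* Catmull-Rom cubic between p1 and p2, with outer neighbors p0 and p3;
   t runs from 0 (at p1) to 1 (at p2). */
double demo_catmull_rom(double p0, double p1, double p2, double p3, double t)
{
    return 0.5 * ((2.0 * p1) +
                  (-p0 + p2) * t +
                  (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t +
                  (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t * t * t);
}

/* Look up x, measured in table-index units (0 .. n-1), in a table of
   n >= 2 entries. Missing neighbors at the ends are clamped. */
double demo_cubic_lookup(const double *table, int n, double x)
{
    int i, i0, i3;

    assert(table != 0 && n >= 2);
    if (x <= 0)
        return table[0];
    if (x >= n - 1)
        return table[n - 1];
    i = (int)x;
    i0 = (i > 0) ? i - 1 : 0;
    i3 = (i + 2 < n) ? i + 2 : n - 1;
    return demo_catmull_rom(table[i0], table[i], table[i + 1], table[i3],
                            x - i);
}
```

For a table holding the squares 0, 1, 4, 9, 16, a lookup at 1.5 returns 2.25, the exact square, whereas linear interpolation between 1 and 4 would give 2.5.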
Ghostscript has partial support for alpha channel and for alpha and RasterOp compositing. There is some architectural support for general compositing, but it postdates the RasterOp implementation, and most of the RasterOp code doesn't use it. We expect that the more extensive compositing and alpha capabilities of SVG will find their way into PDF (and probably PostScript as well) in the course of 2000 and 2001, and we will need to implement them.
Currently, when Ghostscript uses a band list, it does halftoning before banding. It should do halftoning after banding: this produces smaller band lists and shifts more work to the rasterizer (which is good because the rasterizer can be multi-threaded internally for higher performance on multiprocessors: see the next topic.)
When smoothed ("interpolated") images are written in the band list, extra rows must be written above and below each band in order to provide the data for interpolation. Currently, the number of such rows is computed very conservatively; instead, the final interpolation algorithm should be consulted to provide the correct value. This is a small task.
For high-resolution devices, rasterization dominates execution time. On multiprocessor systems, Ghostscript can do tasks in parallel:
We would want these facilities implemented so that no conditional compilation was involved: on uniprocessor systems, the locking API would simply have a vacuous implementation.
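One way to avoid conditional compilation is to route all locking through a table of procedures and supply a do-nothing table on uniprocessor systems. The following is a sketch under that assumption; the structure and all demo_ names are hypothetical and not part of any existing Ghostscript API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock interface: a table of procedures chosen at run time. */
typedef struct demo_lock_procs_s {
    int  (*acquire)(void *lock_data);    /* returns 0 on success */
    void (*release)(void *lock_data);
} demo_lock_procs;

/* Vacuous implementation for uniprocessor or single-threaded use. */
static int  demo_null_acquire(void *lock_data) { (void)lock_data; return 0; }
static void demo_null_release(void *lock_data) { (void)lock_data; }
const demo_lock_procs demo_null_lock = {
    demo_null_acquire, demo_null_release
};

/* A counting implementation, present only to show that client code
   really goes through the table; lock_data points to an int. */
static int  demo_count_acquire(void *lock_data) { ++*(int *)lock_data; return 0; }
static void demo_count_release(void *lock_data) { (void)lock_data; }
const demo_lock_procs demo_counting_lock = {
    demo_count_acquire, demo_count_release
};

/* Client code is identical whichever table it is handed. */
typedef struct demo_shared_s { long total; } demo_shared;

void demo_add(demo_shared *s, const demo_lock_procs *lp, void *lock_data,
              long n)
{
    assert(s != NULL && lp != NULL);
    if (lp->acquire(lock_data) != 0)
        return;
    s->total += n;
    lp->release(lock_data);
}
```

A multiprocessor build would provide a third table backed by the platform's real mutex primitives; nothing in demo_add would change.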
Currently, drivers can't do a very good job of downloading rendered character bitmaps to the device they manage, because they can't find out when a bitmap is being deleted from Ghostscript's cache and therefore will never be referenced again. Here is a sketch of how we would add this capability to the graphics library:
A driver that wanted to be notified would call
gs_glyph_decache_register(imager_state, notify_proc, proc_data)
where proc_data was, or pointed to a structure that included, a pointer to the driver.
The library would then call notify_proc, passing it proc_data and pchar_data, whenever a bitmap was removed from the character cache. pchar_data would point to some identification of the character; perhaps just the bitmap ID, but possibly a gx_cached_bits_common or even a cached_char.
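The registration and callback described in this sketch could be modeled roughly as below. Only the name gs_glyph_decache_register and the parameter names come from the text; the demo_ types, the fixed-size registration table, the callback's argument order, and the use of a bare bitmap ID as pchar_data are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of notify_proc: client data first, then character data. */
typedef void (*demo_decache_proc)(void *proc_data, const void *pchar_data);

#define DEMO_MAX_NOTIFY 4

/* Hypothetical stand-in for the state that would hold registrations. */
typedef struct demo_cache_s {
    demo_decache_proc procs[DEMO_MAX_NOTIFY];
    void *data[DEMO_MAX_NOTIFY];
    int count;
} demo_cache;

/* Analogue of the proposed gs_glyph_decache_register.
   Returns 0 on success, -1 if the table is full. */
int demo_glyph_decache_register(demo_cache *c, demo_decache_proc proc,
                                void *proc_data)
{
    assert(c != NULL && proc != NULL);
    if (c->count >= DEMO_MAX_NOTIFY)
        return -1;
    c->procs[c->count] = proc;
    c->data[c->count] = proc_data;
    c->count++;
    return 0;
}

/* What the cache would do when it discards a bitmap: tell every client. */
void demo_cache_evict(demo_cache *c, unsigned long bitmap_id)
{
    int i;
    assert(c != NULL);
    for (i = 0; i < c->count; i++)
        c->procs[i](c->data[i], &bitmap_id);
}

/* Example client: a driver that records which downloaded bitmap it may
   now delete from the device. */
typedef struct demo_driver_s {
    unsigned long last_freed;
    int notifications;
} demo_driver;

void demo_driver_notify(void *proc_data, const void *pchar_data)
{
    demo_driver *drv = proc_data;
    drv->last_freed = *(const unsigned long *)pchar_data;
    drv->notifications++;
}
```

Passing the driver through proc_data keeps the cache ignorant of driver types, which matches the description of proc_data as a pointer to (or containing a pointer to) the driver.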
This facility was requested by the Display Ghostscript project, but it could also be used to improve the output of the PCL XL driver and possibly the X and PCL5 drivers.
There is a project to create a GNU implementation of the OPENStep API, which involves extending Ghostscript to provide the full functionality of Adobe's Display PostScript system with some of the NeXT extensions. For more information, please contact Net-Community <email@example.com>.
For full Adobe PostScript compatibility, Ghostscript needs a real "job server" to encapsulate the execution of PostScript files. See the section on "Job Execution Environment" in the PostScript Language Reference Manual for details.
Ghostscript could be adapted with some work to read SVG. This would be an interesting and challenging project because SVG's graphics model would require extending the library (see above). If SVG turns out to be an important standard, it is important that there be a good free implementation of it.
Currently, the %font% IODevice is not implemented. We would like to see this implemented using a general framework for implementing IODevices (%xxxx%) entirely in PostScript, in an "object oriented" manner very similar to the way Resource categories are implemented. An IODevice would be implemented as a dictionary with the following keys, whose values would be procedures that implemented the corresponding operation:
/File /DeleteFile /RenameFile /Status /FileNameForAll /GetDevParams /PutDevParams
There would only be global IODevices, no local ones; the dictionary keeping track of them would be stored in global VM.
This is an obscure feature that matters only because some PostScript code uses filenameforall with this IODevice, rather than filenameforall with the /Font Resource category, to enumerate available fonts.
Adobe Acrobat Reader can scan a PDF file whose end-of-lines have been converted by careless users transferring the file across operating systems as text rather than binary, and reconstruct the cross-reference table that the PDF interpreter requires. This only works if the file has no binary data in it, which with PDF 1.3 is rarely the case. However, users occasionally receive PDF files that have been damaged in this way, and it might be useful to have a program that can repair them. We think this should probably be done as a separate program, possibly in PostScript, similar to Ghostscript's PDF linearizer.
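The core of such a repair is finding where each object now starts. The following C sketch (the text suggests PostScript might be used instead) scans a buffer for lines beginning "N G obj" and records their byte offsets, from which a new cross-reference table could be written. All names are hypothetical, and the scan makes the same assumption as the text: the file contains no binary data that could mimic an object header.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_xref_entry_s {
    long num;           /* object number */
    long gen;           /* generation number */
    size_t offset;      /* byte offset of the "N G obj" line */
} demo_xref_entry;

/* Read up to 9 decimal digits starting at p; returns the new position,
   which equals p if no digit was found. */
static size_t demo_read_uint(const char *buf, size_t len, size_t p, long *val)
{
    size_t start = p;
    *val = 0;
    while (p < len && p - start < 9 && isdigit((unsigned char)buf[p]))
        *val = *val * 10 + (buf[p++] - '0');
    return p;
}

/* Record every line that begins "<num> <gen> obj". Returns the number
   of entries stored in out (at most max). */
size_t demo_scan_objects(const char *buf, size_t len,
                         demo_xref_entry *out, size_t max)
{
    size_t pos = 0, found = 0;

    assert(buf != NULL && out != NULL);
    while (pos < len && found < max) {
        long num, gen;
        size_t p = demo_read_uint(buf, len, pos, &num);

        if (p > pos && p < len && buf[p] == ' ') {
            size_t q = demo_read_uint(buf, len, p + 1, &gen);
            if (q > p + 1 && q + 4 <= len && memcmp(buf + q, " obj", 4) == 0) {
                out[found].num = num;
                out[found].gen = gen;
                out[found].offset = pos;
                found++;
            }
        }
        /* Skip to the start of the next line; CR, LF, or CR LF all count. */
        while (pos < len && buf[pos] != '\n' && buf[pos] != '\r')
            pos++;
        while (pos < len && (buf[pos] == '\n' || buf[pos] == '\r'))
            pos++;
    }
    return found;
}
```

A complete tool would also have to fix stream Length values and rewrite the trailer, which this sketch does not attempt.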
Currently, neither the PostScript interpreter nor the graphics library is fully re-entrant (no writable globals). Making them fully re-entrant would make Ghostscript usable in multi-threaded environments, and more easily usable in embedded environments. Note that this is necessary, but far from sufficient, for Ghostscript to allow simultaneous execution of a single Ghostscript interpreter instance by multiple threads: that is probably permanently out of the question. Almost all drivers, including all of Aladdin's own drivers, are already fully re-entrant; making the remaining ones re-entrant should really be up to the driver author.
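As a reminder of what "no writable globals" means in practice, compare the two styles below. This is a generic illustration with hypothetical names, not code from the interpreter or the library.

```c
#include <assert.h>

/* Not re-entrant: the state lives in a writable global, so two users of
   the code in one process interfere with each other. */
static int demo_global_depth = 0;
int demo_push_global(void) { return ++demo_global_depth; }

/* Re-entrant: all state lives in an instance supplied by the caller, so
   separate instances are fully independent. */
typedef struct demo_instance_s { int depth; } demo_instance;

int demo_push(demo_instance *inst)
{
    assert(inst != 0);
    return ++inst->depth;
}
```

With the second style, each thread or embedded client can own its own instance. As the text notes, this still does not allow several threads to drive one instance at the same time.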
Copyright © 2000 Aladdin Enterprises. All rights reserved.
This file is part of AFPL Ghostscript. See the Aladdin Free Public License (the "License") for full details of the terms of using, copying, modifying, and redistributing AFPL Ghostscript.
Ghostscript version 6.50, 2 December 2000