How this site works

This is a mostly static site, with new content added intermittently. Adding a new piece also requires adding the relevant links to the index page, which can be slightly tedious. To make things easier, we now use scripts that automatically generate all the HTML pages in the archive from XML files containing Bengali text. This means that to add a new piece, we only need to convert it to the appropriate XML format, put the file in the appropriate directory, and run the scripts. The XML format we use is very simple (it's basically HTML with a few additional bookkeeping tags), and is discussed in more detail below.

How you can help

You are welcome to join the project to provide encouragement, advice, and best of all, more literary pieces. There is no formal way to join: just subscribe to the SourceForge mailing list dedicated to this project, and send a message to start talking with the other members on the list.

If you would like to contribute a literary piece that you like, please check its copyright status before starting work on it. If there's nothing in particular that you want to work on, but you would just like to help out, we would be happy to provide you with scanned images that you can transcribe.

How to write in Bengali

There is, of course, the question of how to input Bengali in Unicode. Several options are available; which one is best for you depends on what platform you are working on, how much bandwidth you have, and how much new software you are willing to install. These are discussed in more detail here. Please contact us on the mailing list if you have more questions.

XML format and other conventions

XML

We are now storing the documents in an (ad hoc and not yet completely defined) XML format. It would help us if you submitted your piece in this form, but it's not a problem if you don't. All the current files have XML versions on the website, which you can look at as examples. Basically, we have just two tags, <poetry> (meant to contain preformatted text, like poetry and songs) and <p> (for normal text, treated as a paragraph), plus some title/author information. Here is an example with both text and poetry.
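To give a rough idea of how a script might turn such a file into HTML, here is a minimal Python sketch. It is not the archive's actual code. Only the <poetry> and <p> tags come from the description above; the root element <piece>, the <title> and <author> tag names, and the function name to_html are assumptions made up for illustration, so check the real XML files on the website for the actual structure.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample document. The <piece> root and the <title>/<author>
# tag names are assumptions; only <poetry> and <p> come from the text above.
SAMPLE = """<piece>
  <title>Sample title</title>
  <author>Sample author</author>
  <p>An introductory
     paragraph.</p>
  <poetry>First line
    Second line, indented</poetry>
</piece>"""


def to_html(xml_text: str) -> str:
    """Render the hypothetical archive XML as a simple HTML fragment."""
    root = ET.fromstring(xml_text)
    parts = []
    for child in root:
        # Only the element's own leading text is used; nested markup is ignored.
        text = escape(child.text or "")
        if child.tag == "title":
            parts.append(f"<h1>{text}</h1>")
        elif child.tag == "author":
            parts.append(f"<h2>{text}</h2>")
        elif child.tag == "poetry":
            # Preformatted: keep line breaks and spaces exactly as typed.
            parts.append(f"<pre>{text}</pre>")
        elif child.tag == "p":
            # Normal text: collapse whitespace into a single paragraph.
            parts.append(f"<p>{' '.join(text.split())}</p>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(to_html(SAMPLE))
```

The point of the sketch is the split between the two tags: <poetry> content is passed through with its whitespace untouched, while <p> content is reflowed as an ordinary paragraph.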

Formatting conventions

Here are a few general conventions you should follow when typing up documents for inclusion in the Archive:

Quotes

Don't use double quotes ("). Unicode has characters for the left and right quotes typically used in literature, but don't use them either. For now, use the grave accent (`) for the left quote and the apostrophe (') for the right quote. This is mainly because our fonts don't have those characters yet :-P. Eventually, automated code will convert the ` and ' to the correct Unicode characters.

Periods (daanri) and other punctuation

Bengali daanri-s and double daanri-s are represented by the Unicode characters 'Devanagari Danda' (U+0964) and 'Devanagari Double Danda' (U+0965); use these. Your input mechanism should support them (if you don't know how to type them, ask us).

Also, don't leave a space before daanri-s (nor before other punctuation marks like , ; ? ! etc).
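If you want to check a transcription for stray spaces before punctuation, something like the following Python sketch could help. It is only an illustration, not a tool the project provides; the function name fix_spacing is made up, and the punctuation set is just the marks listed above.

```python
import re

DANDA = "\u0964"         # DEVANAGARI DANDA, used for the daanri
DOUBLE_DANDA = "\u0965"  # DEVANAGARI DOUBLE DANDA, used for the double daanri

# Punctuation marks that should not be preceded by a space.
PUNCTUATION = DANDA + DOUBLE_DANDA + ",;?!"

# Spaces or tabs that follow a non-space character and precede punctuation.
# The lookbehind leaves indentation at the start of a line alone, which
# matters for poetry that is laid out with spaces.
SPACE_BEFORE_PUNCT = re.compile(
    r"(?<=\S)[ \t]+(?=[" + re.escape(PUNCTUATION) + r"])"
)


def fix_spacing(text: str) -> str:
    """Remove spaces and tabs that directly precede a punctuation mark."""
    return SPACE_BEFORE_PUNCT.sub("", text)
```

For example, fix_spacing("ami jai \u0964 tumi , ke ?") gives back the same text with the space before the daanri, the comma, and the question mark removed.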

Dashes and hyphens

Literature often contains long and short dashes. These have Unicode code points assigned to them, but don't use them (for now), for the same reason as for the quotes above. Use --- and - for now; we'll eventually convert them to the correct characters.
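The eventual conversion of these ASCII stand-ins (for both quotes and dashes) could look roughly like the Python sketch below. The target characters are assumptions, since this page does not name the exact code points: the sketch uses the single quotation marks U+2018 and U+2019 and the em dash U+2014. The single - is left alone, because the intended replacement for the short dash is not stated. The name to_typographic is made up for illustration.

```python
# Assumed targets: this page does not name the exact Unicode code points.
REPLACEMENTS = [
    ("---", "\u2014"),  # long dash    -> EM DASH
    ("`", "\u2018"),    # left quote   -> LEFT SINGLE QUOTATION MARK
    ("'", "\u2019"),    # right quote  -> RIGHT SINGLE QUOTATION MARK
]


def to_typographic(text: str) -> str:
    """Replace the ASCII stand-ins with the assumed Unicode characters."""
    for ascii_form, unicode_form in REPLACEMENTS:
        text = text.replace(ascii_form, unicode_form)
    return text
```

Because the stand-ins are plain ASCII and used consistently, a simple ordered list of replacements is enough; this is the reason for asking everyone to follow the same convention while typing.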

Advanced Formatting

Poems often need to be formatted precisely. Unfortunately, HTML is not always very good at this, and we also wish to retain the ability to produce output in formats other than HTML, which makes heavy dependence on HTML and CSS unattractive. This is a problem, and we don't know a good solution yet --- currently we just use spaces to achieve as much as possible. If you have any thoughts on this issue, feel free to discuss them on our lists.

Last modified: Thu Aug 5 18:55:56 CDT 2004 by deepayan at stat.wisc.edu