Sharing a Database - Ideas Regarding Standards for Developers

I'm working hard on creating a system to share an online activism database between as many websites as want to join in (it's looking like a beta will come out in January - email me if you want to help test it!).

So far, as one of the very few individuals involved in sharing activist data and putting it online, I've been deciding on my own what information to track. Once the data is shared it gets fun, because people will have different ideas about what we should be tracking.

I see several options.

1. Require that everyone track the same data. The reason to do this would be to avoid having incomplete data. For instance, if you were tracking student activists and some websites didn't ask for graduation dates, you'd run into trouble. I learned in sociology, doing linear regressions, that missing data is messy.

2. Require that everyone prompt the users to input the same set of data, but give people freedom as to what parts of the data set they'll display. The advantage of this is that you'd have a consistency to the data, but people could develop their own interfaces (or "skins") and users can choose whichever interface best meets their needs/desires.

3. Require that developers have a common set of things that we display. For instance, if you are displaying a person's record we could agree that we'd always display the groups that they are connected to (whereas displaying their speaker topics, or campaign updates, or friends could be optional - to save space/make things simpler). We could choose to do this because this is a relational database and if you make it single-dimensional you lose most of its power.

4. Require a base set of fields (ex. first name and an email address for each person), but let developers have a significant degree of freedom in which fields they prompt users for. The idea is that developers will have different priorities. Somebody might want to push a Friendster-style interface and be really interested in getting people to add all of their friendships, whereas other people might want to push activism and get each person to affiliate with issues, campaigns or even ideologies (possibly even taking the politicalcompass or some type of quantitative left/right political test). This might be necessary if developers cannot agree on what their goals should be. It could be a good thing if experimentation leads to productive innovation. It could be bad if things get too complex.

5. Total freeform. People can share whatever they want. This could get really messy if people keep changing the data that they are tracking or changing their interfaces (the web services that other developers use), or if people define things differently - for instance, if different sites list the US occupation of Iraq under different issue names. You need unique identifiers for issues, people, groups, and anything else that more than one site might refer to, so that records for the same thing can be matched up.
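
To make the unique-identifier point concrete, here is a rough sketch in Python of how sites could map differently worded issue names onto one shared ID. Everything in it (the IssueRegistry class, its method names, the use of random UUIDs as identifiers) is my own assumption for illustration, not how the shared database actually works.

```python
import uuid


class IssueRegistry:
    """Hypothetical shared registry: one stable ID per issue, many names."""

    def __init__(self):
        self._id_by_alias = {}   # normalized name -> issue ID
        self._names_by_id = {}   # issue ID -> names as originally entered

    @staticmethod
    def _normalize(name):
        # Compare names without regard to case or extra whitespace.
        return " ".join(name.lower().split())

    def register(self, name, same_as=None):
        """Register a name and return its issue ID.

        If same_as names an already registered issue, the new name becomes
        an alias for it. An unregistered same_as raises KeyError.
        """
        key = self._normalize(name)
        if key in self._id_by_alias:
            return self._id_by_alias[key]
        if same_as is not None:
            issue_id = self._id_by_alias[self._normalize(same_as)]
        else:
            issue_id = uuid.uuid4().hex  # assumed ID scheme
        self._id_by_alias[key] = issue_id
        self._names_by_id.setdefault(issue_id, []).append(name)
        return issue_id

    def lookup(self, name):
        """Return the issue ID for a name, or None if it is unknown."""
        return self._id_by_alias.get(self._normalize(name))


registry = IssueRegistry()
first_id = registry.register("US occupation of Iraq")
second_id = registry.register("Iraq War", same_as="US occupation of Iraq")
```

With this, two sites that label the issue differently still end up pointing at the same record, because both names resolve to one ID. Somebody still has to decide that the two names mean the same thing; the code only records that decision.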

Implementation
These recommendations could, I think, be hardwired into the software (ex. it could refuse to validate a new person record that didn't contain certain fields). For the display-oriented criteria, though, I think it mostly makes sense to have a group of people strongly recommend criteria rather than enforce them in code.
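
As a minimal sketch of what "refusing to validate" could look like, assuming person records arrive as simple dictionaries and that the base set is first name plus email (the field names first_name and email, and both function names, are hypothetical):

```python
# Assumed base set of fields from option 4; the real list would be agreed on
# by the participating developers.
REQUIRED_PERSON_FIELDS = ("first_name", "email")


def missing_fields(record, required=REQUIRED_PERSON_FIELDS):
    """Return the required field names that are absent or blank in record."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def validate_person(record):
    """Return the record unchanged, or raise ValueError if base fields are missing."""
    missing = missing_fields(record)
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return record
```

Any extra fields a site chooses to collect (friendships, issues, graduation dates) pass through untouched, so this check only enforces the shared minimum and leaves the rest to each developer.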

-Aaron-