There are lots of code snippets scattered around the internet, but most of them can't
be wired together in a simple flowchart manner. If you look at object libraries that are
designed specifically for that purpose, like Modelica, you can do all sorts of neat
engineering tasks like simulate the thermodynamics and power usage of a new refrigerator
design. Then if your company is designing a new insulation material you would make a new
"block" with the experimentally determined properties of your material to
include in the programmatic flowchart to quickly calibrate other aspects of the
refrigerator's design. To my understanding, Modelica is as big and good as it gets for
code libraries that represent physically accurate objects. Often, the visual
representation of those objects needs to be handled separately. As far as general purpose,
standard programming libraries go, Mathematica is the best one I've found for quickly
prototyping new functionality. A typical "web mashup" app or site will combine
functionality and/or data from 3 to 6 APIs. Mobile apps will typically use the phone's
functionality, an extra library for better graphics support, a proprietary library or two
made by the company, and a couple of web APIs. The story is similar for desktop media-editing
programs, business software, and high-end games, except that the libraries are often larger. But
there aren't many software libraries that I would describe as huge. And there are even
fewer that manage to scale the usefulness of the library equally with the size it occupies
on disk.
Platform fragmentation (the growing number and popularity of smartphones and tablets) has
proven to be a tremendous challenge for continuing to improve libraries. I now have
15 different ways to draw a circle on different screens. The attempts to provide virtual
machines with write-once run-anywhere functionality (Java and .NET) have failed, often due
to customer lock-in reasons as much as platform fragmentation. Flash isn't designed to
grow much beyond its current scope. The web standards can only progress as quickly as the
least common denominator of functionality provided by other means, which is better than
nothing I suppose. Mathematica has continued to improve their library (that's
essentially what they sell), but they don't try to cover a lot of platforms. They also
aren't open source and don't attempt to make the entire encyclopedia interactive
and programmable. Open source attempts like the Boost C++ library don't seem to grow
very quickly. But I think using Wikipedia articles as a scaffold for a massive open
source, object-oriented library might be what is needed.
I have a few approaches I use to decide what code to write next. They can be arranged from
most useful as an exercise to stay sharp in the long term to most immediately useful for a
specific project. Sometimes I just write code in a vacuum. Like, I will just choose a
simple task like making a 2D ball bounce around some stairs interactively and I will just
spend a few hours writing it and rewriting it to be more efficient and easier to expand.
It always gives me a greater appreciation for the types of details that can be specified
to a computer (and hence the scope of the computational universe, or space of all computer
programs). With the ball bouncing example you can get lost defining interesting
options for the ball and the ground, or in the geometry logic for calculating the
intersections (for example, if the ball doesn't deform or the stairs have certain
constraints on their shape, there are optimizations you can make). At the end of the
exercise I still just have a ball bouncing down some stairs, but my mind feels like it has
been on a journey. Sometimes I try to write code that I think a group of people would find
useful. I will browse the articles in the areas of computer science category by popularity
and start writing the first things I see that aren't already in the libraries I use.
So I'll extend Mathematica's FindClusters function to support density-based
methods, or I'll extend the RandomSample function with a reservoir sampling algorithm
so it can handle files that are too large to fit in memory. Finally, I write code for specific
projects. I'm trying to genetically engineer turf grass that doesn't need to be
cut, so I need to automate some of the work I do for GenBank imports and sequence
comparisons. For all of those, if there were an organized place to put my code afterwards
so it would fit into a larger useful library I would totally be willing to do a little bit
of gluing work to help fit it all together.
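To make the reservoir sampling idea concrete, here is a minimal sketch in Python rather than Mathematica. It is the textbook "Algorithm R", not the code I actually wrote, and the function name reservoir_sample and its parameters are made up for illustration. It keeps only k items in memory no matter how long the input stream is.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return k items chosen uniformly at random from a stream of unknown length.

    Hypothetical helper for illustration: only the k-item reservoir is held
    in memory, so the input can be a file that is far larger than RAM.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir
```

Because an open file object iterates line by line, passing one as items samples lines from a file without loading it. If the stream has fewer than k items, the sketch simply returns all of them.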
Date: Mon, 8 Jul 2013 19:13:54 +0200
From: jane023(a)gmail.com
To: wikidata-l(a)lists.wikimedia.org
Subject: Re: [Wikidata-l] Accelerating software innovation with Wikidata and improved
Wikicode
I am all for a "dictionary of code snippets", but as with all
dictionaries, you need a way to group them, either by alphabetical
order or "birth date". It sounds like you have an idea how to group
those code samples, so why don't you share it? I would love to build
my own "pipeline" from a series of algorithms that someone else
published for me to reuse. I am also for more sharing of datacentric
programs, but where would the data be stored? Wikidata is for data
that can be used by Wikipedia, not by other projects, though maybe
someday we will find the need to put actual weather measurements in
Wikidata for some oddball Wikisource project to do with the history of
global warming or something like that.
I just don't quite see how your idea would translate in the
Wiki(p/m)edia world into a project that could be indexed.
But then I never felt the need for "high-fidelity simulations of
virtual worlds" either.
2013/7/6, Michael Hale <hale.michael.jr(a)live.com>:
I have been pondering this for some time, and I would like some feedback. I
figure there are many programmers on this list, but I think others might
find it interesting as well.
Are you satisfied with our progress in increasing software sophistication as
compared to, say, increasing the size of datacenters? Personally, I think
there is still too much "reinventing the wheel" going on, and the best way
to get to software that is complex enough to do things like high-fidelity
simulations of virtual worlds is to essentially crowd-source the translation
of Wikipedia into code. The existing structure of the Wikipedia articles
would serve as a scaffold for a large, consistently designed, open-source
software library. Then, whether I was making software for weather prediction
and I needed code to slowly simulate physically accurate clouds or I was
making a game and I needed code to quickly draw stylized clouds I could just
go to the article for clouds, click on C++ (or whatever programming language
is appropriate) and then find some useful chunks of code. Every article
could link to useful algorithms, data structures, and interface designs that
are relevant to the subject of the article. You could also find data-centric
programs. Like, maybe a JavaScript weather statistics browser and
visualizer that accesses Wikidata. The big advantage would be that
constraining the design of the library to the structure of Wikipedia would
handle the encapsulation and modularity aspects of the software engineering
so that the components could improve independently. Creating a simulation or
visualization where you zoom in from a whole cloud to see its constituent
microscopic particles is certainly doable right now, but it would be a lot
easier with a function library like this.
If you look at the existing Wikicode and Rosetta Code the code samples are
small and isolated. They will show, for example, how to open a file in 10
different languages. However, the search engines already do a great job of
helping us find those types of code samples across blog posts of people who
have had to do that specific task before. A problem that I run into
frequently, and that the search engines don't help me solve, is this: if I
read a nanoelectronics paper and want to simulate the physical system it
describes, I often have to go to the websites of several different
professors and do a fair bit of manual work to assemble their different
programs into a pipeline, and then the result of my hacking is not easy to
expand to new scenarios. We've made enough progress on Wikipedia that I can
often just click on a couple of articles to get an understanding of the
paper, but if I want to experiment with the ideas in a software context I
have to do a lot of scavenging and gluing.
I'm not yet convinced that this could work. Maybe Wikipedia works so well
because the internet reached a point where there was so much redundant
knowledge listed in many places that there was immense social and economic
pressure to utilize knowledgeable people to summarize it in a free
encyclopedia. Maybe the total amount of software that has been written is
still too small, there are still too few programmers, and it's still too
difficult compared to writing natural languages for the crowdsourcing
dynamics to work. There have been a lot of successful open-source software
projects of course, but most of them are focused on creating software for a
specific task instead of library components that cover all of the knowledge
in the encyclopedia.
_______________________________________________
Wikidata-l mailing list
Wikidata-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l