[H-GEN] Linux and high-performance computing

Tue Sep 1 08:03:56 EDT 2015

These are all interesting words and links.

Be aware, though, that a lot of HPC in uni environments is distributed
rather than parallel: many independent jobs, not one tightly coupled
job. That kind of work is much better suited to spot pricing and other
entirely non-deterministic schedulers.

So, a lot of time is spent determining how to atomise a large problem.
If you can't, another approach is to run your model/experiment across
100 cores with 100 different parameters and "monte-carlo"/shotgun your
way to the solution/minimum.
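
As a rough sketch of that shotgun approach (not anyone's production
code): the Python below evaluates a model at independent random
parameters across worker processes and keeps the lowest cost. The
model() function, the parameter range and the worker count are all
made-up placeholders; on a real cluster each evaluation would more
likely be a separate scheduler job than a local process.

```python
import random
from multiprocessing import Pool


def model(param):
    """Hypothetical stand-in for an expensive model run; returns a cost to minimise."""
    return (param - 3.0) ** 2


def run_one(seed):
    """Draw one random parameter from the given seed and evaluate the model on it."""
    rng = random.Random(seed)          # per-run generator, so every run is independent
    param = rng.uniform(-10.0, 10.0)   # assumed search range, purely illustrative
    return model(param), param


def shotgun_search(n_runs=100, n_workers=4):
    """Evaluate n_runs independent random parameters; return the best (cost, param)."""
    with Pool(processes=n_workers) as pool:
        results = pool.map(run_one, range(n_runs))
    return min(results)                # tuples compare on cost first


if __name__ == "__main__":
    cost, param = shotgun_search()
    print(f"best cost {cost:.4f} at param {param:.4f}")
```

Because no run depends on any other, losing a worker (say, a spot
instance being reclaimed) only costs you that one evaluation, which
you can simply resubmit.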

In my particular area (neuro-informatics) the use of HPC tends to be
100 subjects on 100 cores, or 100,000 gene combinations on a
scheduler.

a

On 1 September 2015 at 21:57, Peter Hall <hall.peter.john at gmail.com> wrote:
> [ Humbug *General* list - semi-serious discussions about Humbug and     ]
> [ Unix-related topics. Posts from non-subscribed addresses will vanish. ]
>
>
> My day job is writing the code for an HPC cluster based around Amazon Web Services (http://aws.amazon.com/): EC2 for compute, S3 for bulk storage, SQS for message passing, DynamoDB for indexes.
>
> For getting started with clusters I'd suggest setting up a toy Hadoop cluster with a few virtual machines. Cloudera have a quickstart VM for getting your feet wet: http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms/cdh-5-3-x.html Once you've got a feel for it, refer to the Hadoop documentation and set up a toy cluster from scratch using virtual machines: http://hadoop.apache.org/docs/current/ My Hadoop knowledge is a few years out of date, so I have only worked with MapReduce. There's also YARN now, which looks interesting from a glance at the docs.
>
> To move beyond toy clusters, see what you can get access to through uni. Alternatively, you can get 64-core machines from Amazon EC2 for about 40 cents/hour using spot bids. Spot bids are significantly cheaper than the standard (on-demand) price, but have the disadvantage that Amazon can shut the instances down at any time if someone else offers to pay more.
>
> Other projects you might like to have a look at:
> Apache Spark - http://spark.apache.org/
> Apache Mesos - http://mesos.apache.org/
>
> Cheers,
> Peter
>
>
>
> On 1 September 2015 at 06:55, Benjamin Fowler <ben.fowler.bjf at gmail.com> wrote:
>>
>> Hello,
>>
>> Does anybody have any suggestions on how I might get familiar with modern high-performance computing architectures?
>>
>> The reason I'm asking is that I'm working my way slowly through a physics degree and I'm starting to think about how I might apply my coding skills to large and interesting problems -- which suggests that I might need to get up to speed on working on software that executes on machines with very large numbers of cores, strange (e.g. NUMA) architectures, message-passing, and what not....
>>
>> Any suggestions on where I might start to get up to speed, to get the "lie of the land" and start exploring the subject? Does anybody have any recent (e.g. in the last 5 years) experience with doing said development at home (e.g. what we used to call "Beowulf")?
>>
>> Cheers, Ben.
>>
>>
>> _______________________________________________
>> General mailing list
>> General at lists.humbug.org.au
>> http://lists.humbug.org.au/mailman/listinfo/general
>>
>
>
>
> --
> Trapped in signature factory please send help
>