Welcome to a curated list of handpicked free online resources related to IT, cloud, Big Data, programming languages, and DevOps. Fresh news and a community-maintained list of links, updated daily.

Apache Kafka is not for event sourcing

Categories

Tags software-architecture apache streaming big-data machine-learning

An article by Jesper Hammarbäck in which he argues that Kafka is not the best tool for event sourcing. Kafka is a great tool for delivering messages between producers and consumers, and the optional topic durability allows you to store your messages permanently. Forever, if you’d like.

The Event Sourcing pattern defines an approach to handling operations on data that’s driven by a sequence of events, each of which is recorded in an append-only store. Application code sends a series of events that imperatively describe each action that has occurred on the data to the event store, where they’re persisted.

He argues that Kafka may look like a natural event store because it is a durable event log, but it really isn’t a suitable tool for event sourcing.

His arguments center on:

  • Loading current state - fast forwarding to current state is not easy in Kafka
  • Consistent writes
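
The two concerns above can be sketched with a tiny in-memory event store. This is an illustrative sketch only, not Kafka's API and not code from the article: `InMemoryEventStore`, `ConcurrencyError`, `balance`, and the account events are all hypothetical names. It shows what an event-sourced application typically expects from its store: loading one entity's events to rebuild current state, and rejecting a write when the entity changed since it was last read (an optimistic concurrency check via `expected_version`).

```python
from dataclasses import dataclass, field


class ConcurrencyError(Exception):
    """Raised when a stream changed since the writer last read it."""


@dataclass
class InMemoryEventStore:
    # Hypothetical store: one append-only list of events per entity (stream).
    streams: dict = field(default_factory=dict)

    def load(self, stream_id):
        # Loading current state only needs this entity's events,
        # not a scan of every event ever written.
        return list(self.streams.get(stream_id, []))

    def append(self, stream_id, event, expected_version):
        # Consistent write: reject if someone else appended in the meantime.
        events = self.streams.setdefault(stream_id, [])
        if len(events) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, found {len(events)}"
            )
        events.append(event)
        return len(events)


def balance(events):
    # Current state is a fold over the entity's event history.
    total = 0
    for kind, amount in events:
        total += amount if kind == "deposited" else -amount
    return total


store = InMemoryEventStore()
store.append("account-1", ("deposited", 100), expected_version=0)
store.append("account-1", ("withdrawn", 30), expected_version=1)
current = balance(store.load("account-1"))
```

A writer that read the account at version 1 and then tries to append with `expected_version=1` after the second event has landed gets a `ConcurrencyError` instead of silently overdrawing the account.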

He concludes that Kafka might be a good complement to your event store as a way of transporting events to downstream query services or read models.

To learn more, read this interesting article.

[Read More]

Making sentiment analysis easy with Scikit-learn

Categories

Tags big-data machine-learning

An article by Lesley Cordero about sentiment analysis. Sentiment analysis uses computational tools to determine the emotional tone behind words. Python has a bunch of handy libraries for statistics and machine learning, so this post uses Scikit-learn to show how to add sentiment analysis to your applications.

Scikit-learn is a Python module with built-in machine learning algorithms. The author uses the Logistic Regression model, which is a linear model commonly used for classifying binary data.

Further in this article:

  • Environment setup for Python 3.6
  • A quick note on Jupyter
  • Preparing the data
  • Linear classifier using the LogisticRegression
  • Note on model accuracy
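
To make the idea of a linear classifier for sentiment concrete, here is a minimal sketch that uses only the Python standard library. It is not the article's code, which relies on Scikit-learn's `LogisticRegression`; it is a hand-rolled logistic regression over bag-of-words counts, trained with stochastic gradient descent. The training sentences, function names, and hyperparameters are made up for illustration.

```python
import math
from collections import Counter

# Tiny made-up training set: 1 = positive, 0 = negative.
TRAINING_DATA = [
    ("great movie loved it", 1),
    ("terrible movie hated it", 0),
    ("loved the acting great fun", 1),
    ("hated the plot terrible bore", 0),
]


def features(text):
    # Bag of words: how often each lowercase token occurs.
    return Counter(text.lower().split())


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(text, weights, bias):
    # Probability that the text is positive under the linear model.
    z = bias + sum(weights.get(w, 0.0) * c for w, c in features(text).items())
    return sigmoid(z)


def train(samples, epochs=200, learning_rate=0.5):
    # Stochastic gradient descent on the logistic (log) loss.
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in samples:
            error = predict_proba(text, weights, bias) - label
            bias -= learning_rate * error
            for word, count in features(text).items():
                weights[word] = weights.get(word, 0.0) - learning_rate * error * count
    return weights, bias


weights, bias = train(TRAINING_DATA)
```

Calling `predict_proba("great fun", weights, bias)` gives a probability above 0.5, and `predict_proba("terrible bore", weights, bias)` one below it. A library implementation adds regularization, better solvers, and proper text vectorization on top of this basic shape.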

You can build a classifier with fewer than 50 lines of Python code and no math. This is a good starting point for anybody interested in machine learning.

[Read More]

Storing data in DNA

Categories

Tags database programming

An MIT Technology Review article about storing data in DNA, which can be a lot easier than getting it back out. Humanity is creating information at an unprecedented rate: some 16 zettabytes every year (a zettabyte is one billion terabytes). Last year, the research group IDC calculated that we’ll be producing over 160 zettabytes every year by 2025.

Researchers have long known that DNA can be used for data storage. What’s impressive for computer scientists is the density of the data that DNA stores: a single gram can hold roughly a zettabyte.
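
A back-of-the-envelope calculation puts the quoted figures side by side. The numbers are the ones given above; the density of one zettabyte per gram is only a rough figure, so the result is an order-of-magnitude estimate and nothing more. The variable names are illustrative.

```python
TERABYTES_PER_ZETTABYTE = 10**9   # "a zettabyte is one billion terabytes"
ZETTABYTES_PER_GRAM = 1           # "a single gram can hold roughly a zettabyte"

current_output_zb = 16            # zettabytes produced per year today
forecast_output_zb = 160          # IDC forecast for 2025

# Today's yearly output expressed in terabytes.
current_output_tb = current_output_zb * TERABYTES_PER_ZETTABYTE

# Grams of DNA needed to hold one year of output at the forecast rate.
grams_for_forecast = forecast_output_zb / ZETTABYTES_PER_GRAM

# How much larger the forecast is than today's output.
growth_factor = forecast_output_zb / current_output_zb
```

Under these assumptions, a full year of data at the 2025 rate would fit in roughly 160 grams of DNA, ten times today's yearly output.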

Bacteria often carry genetic information in the form of tiny circular rings of double-stranded DNA called plasmids.

The idea is simple:

  • Store data in plasmids inside bacterial cells that are trapped in a specific location
  • To retrieve it, send motile bacteria to this site
  • Conjugate with the trapped bacteria and capture the data-carrying plasmids
  • The motile bacteria carry this information to a device

But nobody has come up with a realistic system for storing data in a DNA library and then retrieving it again when it is needed. Innovation at your fingertips.

[Read More]

5 things every developer should know about software architecture

Categories

Tags software-architecture programming

A post by Simon Brown on InfoQ about things developers should know about software architecture. Even now, it seems that software development teams are still struggling with some of the basics, especially those aspects related to software architecture.

A good software architecture enables agility, helping you embrace and implement change.

Some interesting notes and recommendations:

  • Software architecture isn’t about big design up front
  • Every software team needs to consider software architecture
  • The software architecture role is about coding, coaching and collaboration
  • A good software architecture enables agility

There is still a common misconception that “architecture” and “agile” are competing forces in conflict with each other. On the contrary, a good software architecture enables agility, helping you embrace and implement change. Many teams today still implement “architecture indifferent design.” In other words, they adopt an architectural style without necessarily considering the trade-offs. Good read.

[Read More]

In defence of swap -- common misconceptions

Categories

Tags cloud programming

A lengthy post by Chris Down about swap. It is a useful tool that allows equality of reclamation of memory pages, but its purpose is frequently misunderstood, leading to its negative perception across the industry.

There are different types of memory in Linux, and each type has its own properties. Understanding the nuances of these is key to understanding why swap is important.

The article answers common questions:

  • What is the nature of swap
  • What happens with / without swap
    • Under no / low memory contention
    • Under moderate / high memory contention
    • Under temporary spikes in memory usage
  • How to go about tuning

And remember: disabling swap does not prevent disk I/O from becoming a problem under memory contention; it simply shifts the disk I/O thrashing from anonymous pages to file pages. Great read!
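
Before tuning anything, it helps to know how much swap a machine actually uses. The sketch below parses text in the format of Linux's `/proc/meminfo` and reports the fraction of swap in use. The sample values are made up, and `parse_meminfo` and `swap_used_fraction` are illustrative helper names, not something taken from the article.

```python
# Made-up sample in the format of /proc/meminfo (values in kB).
SAMPLE_MEMINFO = """\
MemTotal:       16384000 kB
MemAvailable:    8192000 kB
SwapTotal:       2097152 kB
SwapFree:        1572864 kB
"""


def parse_meminfo(text):
    # Each line looks like "Key:   <number> kB"; keep the number.
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def swap_used_fraction(meminfo):
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0  # no swap configured
    return (total - meminfo["SwapFree"]) / total


info = parse_meminfo(SAMPLE_MEMINFO)
used = swap_used_fraction(info)
```

On a real Linux machine, the same function can be fed `open("/proc/meminfo").read()`, and the current swappiness setting can be read from `/proc/sys/vm/swappiness`.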

[Read More]

Neural networks, manifolds, and topology

Categories

Tags big-data machine-learning

An older article by Christopher Olah. There is a great deal of excitement and interest in deep neural networks because they’ve achieved breakthrough results in areas such as computer vision.

However, there remain a number of concerns about them. One is that it can be quite challenging to understand what a neural network is really doing. It is much easier to explore low-dimensional deep neural networks – networks that only have a few neurons in each layer.

The author then provides more information on:

  • Simple neural network example
  • Continuous visualization of layers
  • Topology of tanh layers
  • Topology and Classification
  • The Manifold Hypothesis
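
The point about the topology of tanh layers can be illustrated with a small sketch. As far as the article's argument goes, a tanh layer with as many outputs as inputs and a non-singular weight matrix is a continuous map with a continuous inverse, so it can stretch and squash the input space but cannot tear it. The weights, bias, and function names below are arbitrary choices for illustration, not values from the article.

```python
import math

# Arbitrary illustrative weights; W must be non-singular (det != 0).
W = [[1.5, -0.5],
     [0.5, 1.0]]
B = [0.1, -0.2]


def tanh_layer(point):
    # Forward pass: affine map followed by pointwise tanh.
    x, y = point
    return (
        math.tanh(W[0][0] * x + W[0][1] * y + B[0]),
        math.tanh(W[1][0] * x + W[1][1] * y + B[1]),
    )


def inverse_tanh_layer(point):
    # Undo tanh with atanh, subtract the bias, then apply the inverse of W.
    u = math.atanh(point[0]) - B[0]
    v = math.atanh(point[1]) - B[1]
    det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
    return (
        (W[1][1] * u - W[0][1] * v) / det,
        (-W[1][0] * u + W[0][0] * v) / det,
    )


original = (0.3, -0.7)
recovered = inverse_tanh_layer(tanh_layer(original))
```

Up to floating-point error, `recovered` matches `original`: no two distinct input points are merged by the layer, which is why a single such layer cannot untangle classes that are topologically linked.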

An excellent article with still-relevant information for anybody interested in neural networks.

[Read More]

Getting started with Rust on the command line

Categories

Tags programming

An introduction to Rust (rustlang) on the command line by Florian Gilcher. It assumes some knowledge of programming, but none of Rust.

The author shows why Rust is a feasible option by writing a small but useful command line tool: a program that fetches a JSON feed, parses it, and outputs it on the console in a formatted fashion.

The article further covers:

  • How to install Rust with rustup
  • How to set up a project with cargo, the build manager for Rust
  • How to plan your project
  • How to include dependencies in your code, e.g. clap, a command line argument parser
  • How to use crates with external libraries

The whole code and further explanation are in the article. If you are a beginner in Rust, this is well worth your time!

[Read More]

4 lessons for modern software developers from 1970s mainframe programming

Categories

Tags programming agile software-architecture

An inspiring article by Alan Zeichick about how current programmers should adopt several attitudes that early mainframe developers considered an essential part of their skill sets.

Eight megabytes of memory is plenty. Or so we believed back in the late 1970s. Our mainframe programs usually ran in 8 MB virtual machines (VMs) that had to contain the program, shared libraries, and working storage.

The author revisits four lessons he learned while programming mainframes:

  • Minimize the cost of computation – especially for cloud computing
  • For data processing, think headless
  • Design and program for zero defects
  • It’s not about refactoring: optimize up front

Learn in detail what the author meant by following the link below.

[Read More]

Introducing Twirp RPC framework for Golang

Categories

Tags apis web-development infosec

Spencer Nelson published an article introducing an RPC framework that Twitch uses for communication between backend servers written in Golang. It’s called Twirp, and it’s available now under an Apache 2 open source license.

He claims that Twirp has been a tremendous success at Twitch due to its advantages over “REST” APIs and gRPC, its two closest competitors.

Twirp is a structured RPC framework, but with an emphasis on simplicity. It works on HTTP 1.1, chooses stability and modularity over an expansive feature set, and then gets out of the way.

Structured RPCs are much easier to design and maintain than URL-oriented REST APIs, as they let you focus on business logic instead of routing schemes.
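
The difference from URL-oriented REST can be sketched from the client side. To my understanding of Twirp's wire protocol, each call is an HTTP POST to a path built from the package, service, and method names, with a JSON (or protobuf) body, so there is no per-endpoint routing scheme to design. Treat the `/twirp/<package>.<Service>/<Method>` path shape as an assumption to check against the Twirp documentation; the host, package, service, method, and payload below are illustrative, and `twirp_request` is a hypothetical helper.

```python
import json
import urllib.request


def twirp_request(base_url, package, service, method, payload):
    # Every RPC is a POST to a path derived from the service definition
    # (assumed path shape: /twirp/<package>.<Service>/<Method>).
    url = f"{base_url}/twirp/{package}.{service}/{method}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host; package, service, and method names are illustrative.
request = twirp_request(
    "http://localhost:8080", "example", "Haberdasher", "MakeHat", {"inches": 10}
)
```

The request is only built here, not sent. Passing it to `urllib.request.urlopen` would require a running Twirp server at that address.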

Learn more, including what is so great about Twirp, why the author considers it better than gRPC, and how easy it is to get started, in this excellent article with code examples. It even works from the command line.

[Read More]

Practical guide for UX and Agile website development

Categories

Tags ux web-development

Jon-Mikel Bailey wrote this guide to UX and Agile website development. Many experts argue for a more agile approach with sprints based on user feedback and statistical website data. The author agrees that they’re not necessarily wrong.

User experience (UX) focuses on having a deep understanding of users, what they need, what they value, their abilities, and also their limitations.

He briefly explains the difference between Waterfall development and Agile development. Agile, also known as rapid development, relies heavily on stakeholder availability. But in fact, very few organizations have the budget or the type of project where Agile makes sense.

He then presents arguments on how to decide whether UX Agile is right for you, based on your organization’s website success, the needs of your users, what your users value, and so on.

[Read More]