Welcome to a curated list of handpicked free online resources related to IT, cloud, big data, programming languages and DevOps. Fresh news and a community-maintained list of links, updated daily.

Calculating the cost of software quality in your organization

Tags management agile teams miscellaneous software

An article by Herb Krasner on the cost of software quality. Meeting the customer’s expectations at a high degree of conformance is no longer expected to come at a premium; it is simply expected.

The author turns his attention to what you, as a leader in your organization, can do about it. Calculating the cost of software quality is an important first step in identifying opportunities to add value from IT while reducing costs, accelerating deliveries and remaining efficient and competitive.

The sections in this article:

  • Cost of good software quality
    • Prevention
    • Appraisals
    • Management control costs
  • Cost of poor quality
    • Internal failure costs
    • External failure costs
    • Technical debt
    • Management failures
  • Strategies for COSQ measurement and improvements
  • Example of what can be accomplished
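
As a rough illustration of how the categories above roll up into one number (this is not the model from the article), the sketch below sums the cost of good quality and the cost of poor quality. The class name, field names and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QualityCosts:
    """Yearly costs per category, in any single currency (hypothetical figures)."""
    # Cost of good software quality
    prevention: float
    appraisal: float
    management_control: float
    # Cost of poor software quality
    internal_failure: float
    external_failure: float
    technical_debt: float
    management_failure: float

    def good(self) -> float:
        return self.prevention + self.appraisal + self.management_control

    def poor(self) -> float:
        return (self.internal_failure + self.external_failure
                + self.technical_debt + self.management_failure)

    def total(self) -> float:
        return self.good() + self.poor()

    def poor_share(self) -> float:
        """Fraction of the total quality cost spent on poor quality."""
        return self.poor() / self.total()


# Made-up numbers, only to show the arithmetic.
costs = QualityCosts(prevention=10, appraisal=20, management_control=5,
                     internal_failure=30, external_failure=40,
                     technical_debt=25, management_failure=10)
print(costs.good(), costs.poor(), costs.total(), costs.poor_share())
```

Tracking the share spent on poor quality over time is one simple way to see whether investment in prevention and appraisal is paying off.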

Understanding the cost of poor software quality in your organization is the first step toward gaining executive buy-in for quality-led operations. You will also find a Cost of Software Quality model in the article. Excellent read!

[Read More]

Blockchain No-Brainer: Ownership in the digital era

Tags crypto blockchain fintech

An article by Dominic Perini on his view of digital asset ownership, provenance and handling. To understand how the notion of ownership is currently perceived in society, the author briefly analyses the journey that has brought us to the present stage and the factors that have contributed to the evolution of our perceptions.

Historically people have been predominantly inclined to own and trade physical objects. This is probably best explained by the fact that physical objects stimulate our senses and don’t require the capacity to abstract, as opposed to services for instance. Ownership was usually synonymous with possession.

The article covers:

  • How we value digital vs. physical assets
  • The evolution of services and their automation
  • Sustainability and access to resources
  • The generative approach
  • What prevents mass adoption of digital goods

A good post on a major emerging innovation that blockchain technology has influenced dramatically over the last two years: the ownership of digital assets.

Privacy and sharing are also heavily debated areas. Owners of digital assets often prefer to remain anonymous, while the benefit of socially shared information is widely recognised. Well done!

[Read More]

Comparison of 3 programming languages for a full-fledged next-generation sequencing tool

Tags programming java golang performance

A study by Pascal Costanza, Charlotte Herzeel and Wilfried Verachtert on choosing a new implementation language for elPrep. elPrep is an established multi-threaded framework for preparing SAM and BAM files in sequencing pipelines. To achieve good performance, its software architecture makes only a single pass through a SAM/BAM file for multiple preparation steps, and keeps sequencing data in main memory as much as possible.

The sequence alignment/map format (SAM/BAM) is the de facto standard in the bioinformatics community for storing mapped sequencing data.

In most programming languages, there exist more or less similar ways to explicitly or implicitly allocate memory for heap objects which, unlike stack values, are not bound to the lifetimes of function or method invocations. However, programming languages strongly differ in how memory for heap objects is subsequently deallocated.

The article then describes typical preparation pipeline steps using elPrep’s software architecture in the three selected programming languages:

  • Sorting reads for coordinate order
  • Removing unmapped reads
  • Marking duplicate reads
  • Replacing read groups
  • Reordering and filtering the sequence dictionary
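
The single-pass idea can be sketched in a few lines of Python. This is a toy illustration, not elPrep's code: reads are plain dictionaries, the field names (`flag`, `rname`, `pos`, `rg`) and helper functions are hypothetical, and sorting by reference name is a simplification of real coordinate order, which follows the sequence dictionary. The only fact taken from the SAM format is that FLAG bit 0x4 marks an unmapped read.

```python
from typing import Callable, Iterable, Iterator, List, Optional

Read = dict
Filter = Callable[[Read], Optional[Read]]

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def remove_unmapped(read: Read) -> Optional[Read]:
    """Drop the read (return None) if its unmapped bit is set."""
    return None if read["flag"] & UNMAPPED_FLAG else read


def replace_read_group(new_rg: str) -> Filter:
    """Build a filter that overwrites the read group of every read."""
    def apply(read: Read) -> Optional[Read]:
        return {**read, "rg": new_rg}
    return apply


def single_pass(reads: Iterable[Read], filters: List[Filter]) -> Iterator[Read]:
    """Apply every filter to each read while iterating over the input once."""
    for read in reads:
        for f in filters:
            read = f(read)
            if read is None:
                break          # a filter removed the read
        else:
            yield read         # all filters kept the read


def prepare(reads: Iterable[Read], filters: List[Filter]) -> List[Read]:
    """Filter in one pass, keep the result in memory, then sort it."""
    kept = list(single_pass(reads, filters))
    kept.sort(key=lambda r: (r["rname"], r["pos"]))
    return kept


reads = [
    {"rname": "chr2", "pos": 5, "flag": 0, "rg": "old"},
    {"rname": "chr1", "pos": 9, "flag": 4, "rg": "old"},   # unmapped
    {"rname": "chr1", "pos": 3, "flag": 0, "rg": "old"},
]
result = prepare(reads, [remove_unmapped, replace_read_group("sample1")])
print(result)
```

Composing the steps as per-read filters is what lets several preparation steps share one pass over the data; only the sort needs the whole filtered set in memory.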

The Go implementation performs best, yielding the best balance between runtime performance and memory use. While the Java benchmarks report a somewhat faster runtime than the Go benchmarks, the memory use of the Java runs is significantly higher. The C++17 benchmarks run significantly slower than both Go and Java, while using somewhat more memory than the Go runs.

Result charts and the detailed benchmark process are also described. How exciting!

[Read More]

Distributed systems with RabbitMQ

Tags devops erlang functional-programming distributed

In this article we’re going to talk about the benefits of distributed systems and how to move to distributed systems using RabbitMQ. Then we will learn the fundamentals of RabbitMQ and how to interact with it using Python. Written by Denis Orehovsky.

The article has these sections:

  • Distributed systems
  • RabbitMQ essentials
  • Working with RabbitMQ using Python

Extensive code examples are provided, along with diagrams explaining the following exchange types: fanout, direct, topic and headers.
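
To make the difference between the exchange types concrete, here is a minimal pure-Python sketch of their routing rules. It does not talk to a broker and is not the article's code: the function names and the binding list are hypothetical, and the headers exchange is left out. It follows the AMQP topic convention that `*` matches exactly one dot-separated word and `#` matches zero or more words.

```python
from typing import List, Tuple


def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if a topic binding pattern matches a routing key."""
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more words
            return any(match(i + 1, jj) for jj in range(j, len(k) + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def route(exchange_type: str, bindings: List[Tuple[str, str]],
          routing_key: str) -> List[str]:
    """Return the queues that would receive a message.

    bindings is a list of (queue_name, binding_key) pairs.
    """
    if exchange_type == "fanout":
        return [q for q, _ in bindings]            # routing key is ignored
    if exchange_type == "direct":
        return [q for q, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [q for q, key in bindings if topic_matches(key, routing_key)]
    raise ValueError(f"unsupported exchange type: {exchange_type}")


bindings = [("all_logs", "logs.#"), ("errors", "logs.error"), ("eu", "*.eu.*")]
print(route("fanout", bindings, "anything"))
print(route("direct", bindings, "logs.error"))
print(route("topic", bindings, "logs.error"))
```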

Using RabbitMQ as a message broker is a great choice. The article covers the fundamentals of RabbitMQ and how to interact with it using the Pika library, though in the real world you will probably use a library like Celery instead of Pika. Nice work!

[Read More]

How we use Apache Kafka and the Confluent Platform

Tags blockchain apache apis data-science scala

Jendrik Poloczek from TokenAnalyst published this article about their experience building the core infrastructure to integrate, clean, and analyze blockchain data.

Apache Kafka® is the central data hub of TokenAnalyst. They’re using Kafka for ingestion of blockchain data. The Confluent Platform is a stream data platform that enables you to organize and manage data from many different sources with one reliable, high performance system.

A public ledger could potentially serve not only as a publicly accessible ledger for money or asset transactions but also as a ledger of interactions on a shared decentralized data infrastructure.

The blockchain as a data structure is, in essence, a giant, shared immutable log, lending itself perfectly for event sourcing and (replayed) stream processing. The required trust comes from transparency. And transparency is realized by surfacing and decoding the data that is stored on the blockchain.
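
The event-sourcing idea can be shown with a small sketch: state is never stored directly, it is derived by replaying an append-only log from the beginning. This is an illustration under simplifying assumptions, not TokenAnalyst's pipeline; the event fields (`from`, `to`, `value`) and the `replay` function are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable


def replay(log: Iterable[dict]) -> Dict[str, int]:
    """Rebuild account balances by folding over an immutable event log."""
    balances: Dict[str, int] = defaultdict(int)
    for event in log:
        balances[event["from"]] -= event["value"]
        balances[event["to"]] += event["value"]
    return dict(balances)


# An append-only log of transfer events (made-up data).
log = [
    {"from": "alice", "to": "bob", "value": 5},
    {"from": "bob", "to": "carol", "value": 2},
    {"from": "alice", "to": "carol", "value": 1},
]

current = replay(log)        # state after all events
earlier = replay(log[:2])    # state as of the second event
print(current, earlier)
```

Because the log never changes, replaying it always yields the same state, and replaying a prefix reconstructs the state at any earlier point.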

The article covers:

  • Why does on-chain data matter?
  • Cluster of Ethereum nodes, Ethereum-to-Kafka bridge
  • Block confirmer based on Kafka Streams
  • API and software development kit (SDK)

They use templates written in Terraform, which allow them to easily deploy and bootstrap nodes across the planet in different AWS regions, together with the Geth and Parity clients.

To bridge the gap between different Ethereum clients and Kafka, they developed an in-house solution named Ethsync, written in Scala. Good read!

[Read More]

Finding the cheapest flights for a multi-leg trip with Amadeus API and Python

Tags python machine-learning programming software

Vladimir Iakovlev is the author of this tutorial about finding the cheapest flights for a multi-leg trip with the Amadeus API and Python. Amadeus Travel APIs connect you to the richest information in the travel industry.

The tutorial is split into:

  • Restrictions
  • Airports
  • Prices and dates
  • Itinerary

To better understand the complexity of the problem, the author draws a graph of possible flight routes.
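
A simplified version of the search over such a graph can be sketched as a brute-force pass over every visiting order. This is not the author's method and it makes no Amadeus API calls: the price table is made up, travel dates and durations are ignored, and the trip is assumed to start and end at the same airport.

```python
from itertools import permutations
from typing import Dict, Optional, Sequence, Tuple

Leg = Tuple[str, str]

# Hypothetical one-way prices between airports, keyed by (origin, destination).
PRICES: Dict[Leg, int] = {
    ("PRG", "LIS"): 90, ("PRG", "BCN"): 60, ("PRG", "FCO"): 70,
    ("LIS", "BCN"): 40, ("BCN", "LIS"): 45,
    ("LIS", "FCO"): 80, ("FCO", "LIS"): 85,
    ("BCN", "FCO"): 50, ("FCO", "BCN"): 55,
    ("LIS", "PRG"): 95, ("BCN", "PRG"): 65, ("FCO", "PRG"): 75,
}


def cheapest_itinerary(home: str, cities: Sequence[str],
                       prices: Dict[Leg, int]) -> Optional[Tuple[int, Tuple[str, ...]]]:
    """Try every visiting order and return (total price, route), or None."""
    best = None
    for order in permutations(cities):
        route = (home, *order, home)
        legs = list(zip(route, route[1:]))
        if any(leg not in prices for leg in legs):
            continue  # no flight available for one of the legs
        cost = sum(prices[leg] for leg in legs)
        if best is None or cost < best[0]:
            best = (cost, route)
    return best


best = cheapest_itinerary("PRG", ["LIS", "BCN", "FCO"], PRICES)
print(best)
```

Brute force is fine for a handful of cities, but the number of orders grows factorially, which is one reason the real problem, with dates and many candidate flights per leg, gets complex quickly.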

To find out how the author was able to get the cheapest flights with minimal duration, at prices almost the same as on Google Flights, follow the link to the original source. A Jupyter notebook with the whole adventure is also available. Well done!

[Read More]

Small & fast Docker images using GraalVM's native-image

Tags java scala docker programming devops

An article by Adam Warski focusing on the optimization of deployment images. The JVM ecosystem has a lot of great traits, but small, cloud-deployment-friendly Docker images are not among them.

Dockerizing even a simple application will result in an image that has hundreds of megabytes, as it needs to contain all dependencies (jars) and the whole JVM. Can we do anything about it?

The article then dives into:

  • GraalVM and native-image utility
  • Compile the Scala (Java/Kotlin/…) application to .jar files
  • Use native-image to generate a native binary
  • Create a docker image with the generated native binary
  • An example using Scala + sbt

The author reduced the size of the Docker image from 647 MB to 26.5 MB. Example code is also included. Great guide!

[Read More]

Indexes bad practices

Tags database sql miscellaneous linux

In this article, Federico Razzoli discusses common mistakes and bad practices with secondary indexes. While the article is focused on MySQL and the InnoDB storage engine, most of this information applies to any DBMS.

An index is an ordered data structure.

Common mistakes and bad practices covered:

  • Duplicate indexes
  • Indexes and the primary key
  • Expecting the impossible
  • Columns order
  • UNIQUE indexes
  • Index names
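
Why column order matters follows directly from an index being an ordered structure. The toy sketch below (not MySQL code; the table, column names and helper functions are hypothetical) models a composite index on (last_name, first_name) as a sorted list of tuples, with the primary key appended to each entry.

```python
import bisect
from typing import List, Tuple

# Hypothetical table rows: (primary key, last_name, first_name)
people = [
    (1, "Smith", "Anna"),
    (2, "Jones", "Bob"),
    (3, "Smith", "Carl"),
    (4, "Brown", "Anna"),
]

# Composite index on (last_name, first_name), kept in sorted order.
index: List[Tuple[str, str, int]] = sorted(
    (last, first, pk) for pk, last, first in people
)


def find_by_last(index: List[Tuple[str, str, int]], last: str) -> List[int]:
    """Use the ordering: binary-search to the first match, then scan the run."""
    i = bisect.bisect_left(index, (last,))
    ids = []
    while i < len(index) and index[i][0] == last:
        ids.append(index[i][2])
        i += 1
    return ids


def find_by_first(index: List[Tuple[str, str, int]], first: str) -> List[int]:
    """first_name is not the leading column, so every entry must be checked."""
    return [pk for _, f, pk in index if f == first]


print(find_by_last(index, "Smith"))
print(find_by_first(index, "Anna"))
```

A lookup on the leading column can jump straight to the matching range, while a lookup on the second column alone cannot use the ordering and degrades to a full scan of the index.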

Indexes are essential for query performance. And fast queries are essential for application performance. Interesting read!

[Read More]

Exposing MyRocks internals via system variables: data writing

Tags database sql miscellaneous

Peter Sylvester wrote this blog post about a new storage engine for MySQL that’s rising in popularity: MyRocks, the log-structured-merge-driven RocksDB engine in MySQL.

The content is split into:

  • Rows vs. key-value
  • Column families
  • System variables vs. column family options
  • In-Memory data writes
  • Variables and CF_OPTIONS
  • Associated Metrics

In this post, you get column families explained, the differences between server variables and column family options, how data gets written into memory space, the associated logging, and different conditions that designate when that data will be flushed to disk. Good read!
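
The write path described above can be illustrated with a toy log-structured store. This is a sketch of the general technique, not RocksDB or MyRocks code: the class and its `flush_threshold` parameter are hypothetical, and the flush condition counts keys, whereas a real engine decides based on memory size and other conditions.

```python
from typing import Any, Dict, List, Optional, Tuple


class ToyLSM:
    """Writes land in memory first and are flushed as sorted runs."""

    def __init__(self, flush_threshold: int = 3) -> None:
        self.flush_threshold = flush_threshold       # hypothetical knob
        self.log: List[Tuple[str, Any]] = []         # stand-in for a write-ahead log
        self.memtable: Dict[str, Any] = {}           # in-memory write buffer
        self.sorted_runs: List[List[Tuple[str, Any]]] = []  # stand-ins for files on disk

    def put(self, key: str, value: Any) -> None:
        self.log.append((key, value))                # log the write first
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        """Write the buffer out as one immutable, sorted run and start fresh."""
        if self.memtable:
            self.sorted_runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key: str) -> Optional[Any]:
        if key in self.memtable:                     # newest data wins
            return self.memtable[key]
        for run in reversed(self.sorted_runs):       # then newest run first
            for k, v in run:
                if k == key:
                    return v
        return None


db = ToyLSM(flush_threshold=2)
db.put("a", 1)
db.put("b", 2)    # reaches the threshold and triggers a flush
db.put("a", 3)    # newer value stays in memory for now
print(db.get("a"), db.get("b"), db.sorted_runs, db.memtable)
```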

[Read More]

How to remove unused CSS from your website

Tags how-to css programming performance

An article by Dan Englishby describing how to go about cleaning up CSS files. A cascading style sheet can gradually build up into a bulky file over time. This means two things: your CSS file is messy, and it is unnecessarily large.

A bigger CSS file means longer download times, and we don’t want that if it’s not necessary!

The article contains:

  • Prerequisites
  • Understanding how PurifyCSS works
  • Installing PurifyCSS
  • Prepping our files
  • Creating the JS Purifier Script
  • Purifying
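
The general idea of dropping rules whose selectors never appear in the content can be sketched in Python. This is a deliberately naive illustration, not PurifyCSS itself (which is a JavaScript tool): the function names are hypothetical, only class selectors in flat rules are handled, and at-rules, nested blocks and classes added by scripts are ignored.

```python
import re


def used_class_names(html: str) -> set:
    """Collect every class name that appears in a class="..." attribute."""
    names = set()
    for attr in re.findall(r'class="([^"]*)"', html):
        names.update(attr.split())
    return names


def purge_unused(css: str, html: str) -> str:
    """Keep only rules whose class selectors are all used in the HTML."""
    used = used_class_names(html)
    kept = []
    # Matches flat rules of the form "selector { declarations }" only.
    for m in re.finditer(r'([^{}]+)\{([^{}]*)\}', css):
        selector = m.group(1).strip()
        body = m.group(2)
        classes = re.findall(r'\.([A-Za-z_][\w-]*)', selector)
        if all(c in used for c in classes):   # rules without classes are kept
            kept.append(f"{selector} {{{body}}}")
    return "\n".join(kept)


css = ".btn { color: red; } .unused { color: blue; } body { margin: 0; }"
html = '<body><a class="btn primary">Click</a></body>'
purified = purge_unused(css, html)
print(purified)
```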

In this article, the author used PurifyCSS to shed 13 KB from an example CSS file, which was a 70% reduction. Nice work!

[Read More]