Welcome to a curated list of handpicked free online resources related to IT, cloud, Big Data, programming languages, and DevOps. Fresh news and a community-maintained list of links, updated daily.

Benchmarking time series workloads on Apache Kudu using TSBS

Tags analytics big-data data-science performance devops

Since the open-source introduction of Apache Kudu in 2015, it has billed itself as storage for fast analytics on fast data. This general mission encompasses many different workloads, but one of the fastest-growing use cases is that of time-series analytics. By Todd Lipcon.

In this blog post, we’ll evaluate Kudu against three other storage systems using the Time Series Benchmark Suite (TSBS), an open-source collection of data and query generation tools representing an IT operations time-series workload.

The article then covers:

  • Kudu-TSDB architecture
  • Benchmarking target systems
  • Benchmark hardware
  • Benchmark setup
  • Results: Data loading performance
  • Results: Light queries, 8 client threads
  • Results: Light queries, 16 client threads
  • Performance on heavy queries

Although Apache Kudu is a general-purpose store, its focus on fast analytics for fast data makes it a great fit for time series workloads. Beyond the quantitative differences the benchmarks show, it’s important to understand qualitative differences between the stores. In particular, Kudu and ClickHouse share the trait of being general-purpose stores, whereas VictoriaMetrics and InfluxDB are limited to time series applications. In practical terms, this means that Kudu and ClickHouse allow your time series data to be analyzed alongside other relational data in your warehouse, and to be analyzed using alternative tools such as Apache Spark, Apache Impala, Apache Flink, or Python pandas. Good read!

[Read More]

Cloud platform teams are everywhere — here's why

Tags cloud management infosec teams devops

In HashiCorp’s new State of Cloud Strategy survey, 86% of respondents said they rely on cloud platform teams — for a wide variety of very good reasons. Organizations with complex business requirements have long sought ways to simplify operations and boost the productivity of their software development teams. It appears business and IT leaders have found an answer: adopt and empower centralized cloud platform teams. By Jared Ruckle.

Main article points:

  • What’s powering the rise of platform teams?
  • Different cloud service providers have different APIs
  • Skills shortages
  • Cultural transformation is siloed and uneven
  • Governance becomes difficult to manage
  • Business leaders trust platform teams – and expect a lot from them
  • Cloud platform teams require continued investment

In our experience working with many of the world’s largest brands, platform teams typically include engineers who provision, run, and manage cloud infrastructure and other shared services. These teams create and operate highly automated platforms available on-demand across the organization. Developers can access the platform capabilities via self-service processes, making it easy to quickly create new environments and new service instances.

But platforms are never finished, just shipped. There’s always more to be done. The Forrester Consulting study (Unlocking Multicloud’s Operational Potential) suggests platform teams are critical “to mitigate people- and process-themed challenges like skills shortages (41%) and siloed teams (35%).” Good read!

[Read More]

How to build an organizational culture that is 'cybersecurity ready'

Tags cio management infosec teams frameworks

Cyber threats are some of the biggest challenges organizations face, but cybersecurity failure is still seen as a critical short-term risk. By Candid Wüest, Nisha Almoula, Roman Hagen @weforum.org.

Cyber risk is one of the main challenges that organizations face today. The World Economic Forum’s Global Risks Report 2022 highlights how cyber threats have intensified through digital transformation and growing digital dependency.

The article then walks you through:

  • 80% of firms have suffered a cybersecurity breach
  • Boards should prioritize cyber risks in planning
  • Strategic involvement is vital to secure assets and services
  • Cross-functional coordination can strengthen response capabilities
  • Collaboration is key to being ‘cybersecurity ready’

Most executives and board members are aware of key global cyber threats and recognize cybersecurity risk as an enterprise-wide risk, but not everyone understands the impact of these cyber risks and their economic drivers.

Cybersecurity must be a core strategic priority, and ownership and accountability for cybersecurity risk management activities must be adopted both within and outside the CISO organization. Nice one!

[Read More]

Steps to emulate k8s Pod Network

Tags cloud cio kubernetes containers devops gcp

Networking is the spine of Kubernetes, but it can be challenging to understand exactly how it is expected to work. There are 4 distinct networking problems to address. By Harinderjit Singh.

There are multiple ways to meet the requirements Kubernetes lays out for pod networking. The main distinction between them is whether the pod network address space is part of the node pool’s subnet or is separate from it. We will try to emulate the latter.

The article then explains:

  • Pod network
  • Test Configuration
  • Emulation of pod network
  • Testing the connectivity

Linux namespaces (particularly network namespaces) make it easy to implement these requirements. The kubelet assigns a network namespace to a pod as soon as the pod is scheduled. That means one network namespace for each pod. Good read!
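To make the "separate pod address space" idea concrete, here is a minimal Python sketch of the address planning only. The CIDR ranges, node names, and the choice to reserve the first host address as a node-side gateway are all assumptions for illustration; the article's own ranges are not reproduced here, and the sketch creates no namespaces.

```python
import ipaddress

# Assumed example ranges, not taken from the article.
NODE_SUBNET = ipaddress.ip_network("10.0.0.0/24")    # where node IPs live
POD_NETWORK = ipaddress.ip_network("10.244.0.0/16")  # separate pod address space


def plan_pod_cidrs(node_names, prefix_len=24):
    """Give each node its own slice of the pod network."""
    if POD_NETWORK.overlaps(NODE_SUBNET):
        raise ValueError("pod network must not overlap the node subnet")
    slices = POD_NETWORK.subnets(new_prefix=prefix_len)
    return {name: next(slices) for name in node_names}


def pod_ip(node_cidr, pod_index):
    """Return the address for the pod_index-th pod on a node.

    The first host address is reserved for a node-side gateway
    (an assumption of this sketch).
    """
    hosts = list(node_cidr.hosts())
    return hosts[1 + pod_index]


plan = plan_pod_cidrs(["node-a", "node-b"])
for node, cidr in plan.items():
    print(node, cidr, "first pod:", pod_ip(cidr, 0))
```

Each pod address would then be configured inside that pod's own network namespace, with routes between nodes covering the per-node slices.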

[Read More]

Hidden gems of Google BigQuery

Tags golang app-development database miscellaneous gcp

BigQuery is amazing. It is one of my favorite tools within Google Cloud. Luckily, it looks like Google feels the same and, to the joy of BigQuery fans, keeps adding new features there. By Artem Nikulchenko.

Let’s say you push some data into BigQuery, and then another system wants to run a scheduled job to process the newly arrived data. For example, a system may pull data from BigQuery into another store, or run hourly reports based on the data. In each of those cases, you would prefer to avoid processing the same records multiple times. As a result, you need a way to know which records are already processed and which were added after the processing took place.
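One common way to frame this problem is a watermark: every record carries an increasing marker, and the consumer remembers the highest marker it has seen. The Python sketch below illustrates that general idea in memory; it is not the article's BigQuery feature, and the `Record`, `seq`, and `IncrementalReader` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    seq: int       # hypothetical, monotonically increasing ingestion marker
    payload: str


class IncrementalReader:
    """Remembers the highest marker it has already handed out."""

    def __init__(self):
        self.watermark = 0

    def fetch_new(self, table):
        # Similar in spirit to: SELECT * FROM table WHERE seq > @watermark
        fresh = sorted(
            (r for r in table if r.seq > self.watermark),
            key=lambda r: r.seq,
        )
        if fresh:
            self.watermark = fresh[-1].seq
        return fresh


table = [Record(1, "a"), Record(2, "b")]
reader = IncrementalReader()
first_run = reader.fetch_new(table)   # both records are new
table.append(Record(3, "c"))
second_run = reader.fetch_new(table)  # only the record added since the first run
```

The approach only holds if markers are assigned in the order records become visible; a record that lands late with a smaller marker would be skipped.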

No matter how long you have been working with BigQuery, there is always something new to discover once in a while. Today the author wants to share the following four things:

  • AUTO column
  • Multi-statement transactions
  • Clustering
  • Indexes

As you may guess from the name, an index is designed for point lookups, but not over any field. Currently, indexes can be used to easily find unique data elements that are buried in unstructured text or semi-structured JSON data. Indexes are only used when a SEARCH query is executed. Good read!

[Read More]

Shaving 40% off Google's B-Tree implementation with Go Generics

Tags golang app-development performance programming

There are many reasons to be excited about generics in Go. In this article, I’m going to show how, using Go generics, ScyllaDB achieved a 40% performance gain in an already well-optimized package, the Google B-Tree implementation. By Michał Matczuk.

The work covered in this article was part of ScyllaDB’s long-standing partnership with the Computer Science Department at the University of Warsaw. We’ve worked on a number of projects together: integrating Parquet, an async userspace filesystem, a Kafka client for Seastar, a system for linear algebra in ScyllaDB and a design for a new Rust driver.

The article then describes and explains well:

  • Making faster B-Trees with Generics
  • The additional allocation
  • Why is it faster?

By shifting from an implementation using interfaces to one using generics, we were able to significantly improve performance, minimize garbage collection time, and minimize CPU and other resource utilization, such as heap size. Particularly with heap size, we were able to reduce HeapObjects by 99.53%. You will find all the code and performance testing described in the article as well. Very interesting!

[Read More]

Building open source search app with Appwrite and Meilisearch

Tags search app-development cloud open-source

Imagine you are building an inventory system, an ecommerce website, or a social media platform of your own. What are the two major things you will have to deal with? A database and a search engine. By Haimantika Mitra.

This article builds a simple app for searching movies, using:

  • Meilisearch for building the search
  • Appwrite for the backend, to store data and run functions that automate the search

The article then covers:

  • Prerequisites
  • Creating an Appwrite function
  • Configuring the Appwrite function
  • Building the search app
  • Fetching data from Appwrite to Meilisearch

Appwrite is a self-hosted backend-as-a-service platform that provides developers with all the core APIs required to build any application. Appwrite provides you with a set of APIs, tools, and a management console UI to help you build your apps a lot faster and in a much more secure way. Meilisearch is a powerful, fast, open-source, easy to use and deploy search engine. Both searching and indexing are highly customizable. Features such as typo-tolerance, filters, and synonyms are provided out-of-the-box. Good read!

[Read More]

How to be an architect?

Tags management teams career miscellaneous cio agile

It is a somewhat agreed understanding that architects in a technical team are very senior people who should know everything. This is neither the complete truth nor a lie. What architects are supposed to do (I believe) still varies from organisation to organisation. By Husyn for AWS Community Builders.

The core skill for this role, I believe, is critical thinking and decision making. For any problem statement, different roles in the software engineering domain will have different solutions based on their perspectives. An architect is the one who should understand all such perspectives and make decisions based on first principles.

The article walks you through:

  • Types and kinds of architects
  • First principles & decision making
  • Soft skills
  • Business vs tech
  • Biases & preferences
  • Writing
  • Defining & naming
  • Architects to follow
  • Cloud architect

… and more. A “good” cloud architect can’t be good in one area and ignorant in another. They have to be a full package for the organisation. As a real-life example, AWS architects have a proper T-shaped skillset. Yes, there are specialised SAs, but even they understand the basics of other domains. Interesting!

[Read More]

Micro-frontend with React and Next.js

Tags frontend react app-development web-development nodejs

Working on a large-scale project and managing its codebase can be a big challenge for teams. Though micro-frontends have been in the picture for a while now, they are getting increasingly popular because of their unique features and usability. By Harsh Patel.

Micro-frontends are particularly helpful because multiple teams can work on individual modules of the same project without worrying about other modules. With micro-frontends, it doesn’t matter how many modules will be added to a current system.

In this article, we’ll cover the basics of what a micro-frontend is and how to implement it using Next.js. We’ll also discuss the advantages of using micro-frontends in your applications.

  • Introduction to micro-frontends
  • Implementing a micro-frontend with Next.js
    • Prerequisites
    • Setting up the micro-frontends
    • Execution and results
  • Advantages of micro-frontends
    • Deployment and security
    • Scalability
    • Faster development
    • Easy testing

So how small is a micro-frontend? This is still unanswered. The bottom line is that you should split your project up so that the user experience won’t be disturbed. This process may be painful because it’ll likely include multiple whiteboard revisions. Good read!

[Read More]

This is how we scale: Using a centralized logging solution

Tags devops aws app-development software-architecture serverless monitoring

Our current centralized logging solution is Logz.io, and most of our application logs are sent there from the k8s cluster. In addition, we use the logging system’s alert mechanism to trigger and send alerts to various destinations, including email, Slack channels, etc. By Uriah Ahrak.

When we have several dozens of Lambda functions, we need to solve the problem in a generic way to reduce the amount of work and migration required by the developers who work with FaaS technology on a daily basis.

The article has these main parts:

  • Possible solutions
  • How do we plug in the currently deployed Lambda functions?
  • Deployment of logs consumer
  • Deployment of logs extension
  • Logging code-based Lambdas
  • Logging via CloudWatch log streams

For the final solution, we chose to separate our code from the logging system specifics. We set up a Kinesis Stream, a generic entry point that anyone with the AWS SDK and appropriate AuthZ can use to send logs in a specific format. If you have Lambda functions/applications that don’t support Lambda extensions, and you don’t want to change the application code in order to send logs to the Kinesis Stream, you can use the CloudWatch Logger Lambda, which listens to CloudWatch log streams and pushes these logs automatically to the Kinesis Stream. Good read!
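As a rough illustration of "send logs in a specific format", the Python sketch below wraps log lines in the entry shape that the Kinesis PutRecords call accepts and splits them into batches. The JSON field names, the partition-key choice, and the stream name in the comment are hypothetical; the article does not spell out its format here, and nothing is actually sent.

```python
import json
import time

MAX_BATCH = 500  # PutRecords accepts at most 500 records per request


def to_kinesis_entry(source, level, message, timestamp=None):
    """Wrap one log line in the entry shape PutRecords expects.

    The JSON field names are hypothetical, chosen for this sketch.
    """
    body = {
        "source": source,
        "level": level,
        "message": message,
        "timestamp": timestamp if timestamp is not None else time.time(),
    }
    return {
        "Data": json.dumps(body).encode("utf-8"),
        "PartitionKey": source,  # routes one producer's logs to the same shard
    }


def batches(entries, size=MAX_BATCH):
    """Yield slices small enough for a single PutRecords request."""
    for start in range(0, len(entries), size):
        yield entries[start:start + size]


# Sending is left out of this sketch. With boto3 it would look like:
#   boto3.client("kinesis").put_records(StreamName="app-logs", Records=batch)
# where "app-logs" is a made-up stream name.
```

Keeping the formatting in one small function is what lets code-based Lambdas, extensions, and a CloudWatch-forwarding Lambda all emit the same record shape.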

[Read More]