To Start / Stop Kafka Server (ZooKeeper must be running first)

.\bin\windows\kafka-server-start.bat .\config\server.properties
.\bin\windows\kafka-server-stop.bat .\config\server.properties

To Start / Stop Zookeeper Server

.\bin\windows\zookeeper-server-start.bat .\config\zookeeper.properties
.\bin\windows\zookeeper-server-stop.bat .\config\zookeeper.properties

Create Topic in Kafka with 5 Partitions and Replication Factor of 1

kafka-topics.bat  --bootstrap-server localhost:9092 --topic firsttopic --create --partitions 5  --replication-factor 1

Note: The replication factor cannot be more than the number of brokers, so on a single-broker localhost setup it cannot exceed 1.

List Topics

kafka-topics.bat --bootstrap-server localhost:9092 --list

Describe Topic

kafka-topics.bat  --bootstrap-server localhost:9092 --topic firsttopic --describe

Delete Topic

kafka-topics.bat  --bootstrap-server localhost:9092 --topic firsttopic --delete

Producer to push data into Topic in Kafka

kafka-console-producer.bat --broker-list localhost:9092 --topic test

Producer sending data into Topic as Key:Value pair

kafka-console-producer.bat --broker-list localhost:9092 --topic firsttopic  --property parse.key=true --property key.separator=:

Note:

  1. Messages with the same key always end up in the same partition
  2. The key.separator must be passed in the command so the producer can differentiate the key from the value

If you push data to a topic which doesn’t exist, the topic gets auto-created after a few retries (provided auto topic creation is enabled on the broker).
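The same idea from Java, as a minimal sketch (not part of the original notes): a producer sending key:value records to firsttopic on localhost:9092, assuming the kafka-clients library is on the classpath. Records that share a key hash to the same partition.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Both records carry the key "user1", so they land in the same partition.
            producer.send(new ProducerRecord<>("firsttopic", "user1", "first message"));
            producer.send(new ProducerRecord<>("firsttopic", "user1", "second message"));
            producer.flush();
        }
    }
}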

Consumer to pull data from Topic in Kafka

kafka-console-consumer.bat --topic test --bootstrap-server localhost:9092 --from-beginning

Print Partition, Key, Value in consumer

kafka-console-consumer.bat --topic thirdtopic --bootstrap-server localhost:9092  --formatter kafka.tools.DefaultMessageFormatter --property print.timestamp=true --property print.key=true --property print.value=true --property print.partition=true --from-beginning

Adding a consumer to a consumer group

kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic third_topic --group my-first-application

Reset Offsets for a Consumer Group on all Partitions of a Topic

kafka-consumer-groups.bat --bootstrap-server localhost:9092 --group my-first-application --topic thirdtopic --reset-offsets --to-earliest --execute

Note: Resetting offsets makes the consumer read again from the new offset position.

Sample message pushed to the topic: {"Message": "Hello World from Kafka"}

How Topics, Partitions and Brokers are related

Topics are logical categories or streams of data within Kafka. They act as message queues where producers publish data and consumers retrieve it.
Brokers are servers that store and manage topics and handle communication between producers and consumers.
Partitions are the basic unit of data storage and distribution within Kafka topics. They are the main mechanism of concurrency for topics and are used to improve performance and scalability.

What is Broker Discovery?
A client that wants to send or receive messages from the Kafka cluster may connect to any broker in the cluster. Every broker in the cluster has metadata about all the other brokers and will help the client connect to them as well, and therefore any broker in the cluster is also called a bootstrap server.

  1. A client connects to a broker in the cluster
  2. The client sends a metadata request to the broker
  3. The broker responds with the cluster metadata, including a list of all brokers in the cluster
  4. The client can now connect to any broker in the cluster to produce or consume data
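As a rough illustration of the same flow in code (host/port are assumptions for the local setup used above), the Admin API can be pointed at a single bootstrap broker and asked for the full cluster metadata:

import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.common.Node;

public class ClusterMetadataDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Only one bootstrap broker is given; the metadata response lists all brokers.
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            for (Node node : admin.describeCluster().nodes().get()) {
                System.out.println("Broker " + node.id() + " at " + node.host() + ":" + node.port());
            }
        }
    }
}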

What is Replication Factor?
The replication factor is the number of copies of a topic’s partitions across different brokers. For a production system the replication factor should be at least 3; a replication factor of 3 is commonly used because it balances tolerance to broker loss against replication overhead.

Note that topic replication does not increase consumer parallelism (only partitions do).


How to choose the replication factor

It should be at least 2 and at most 4. The recommended number is 3, as it provides the right balance between performance and fault tolerance, and cloud providers usually offer 3 data centers / availability zones per region to deploy to. The advantage of a higher replication factor is better resilience of your system: if the replication factor is N, up to N-1 brokers may fail without impacting availability (with acks=0 or acks=1).

The disadvantages of a higher replication factor are higher latency experienced by producers (with acks=all, the data needs to be replicated to all replica brokers before an ack is returned) and more disk space required on your system.

If there is a performance issue due to a higher replication factor, you should get a better broker instead of lowering the replication factor

Maximum Replication Factor = No of Brokers in Cluster

What is min.insync.replicas?
min.insync.replicas is the minimum number of copies of the data that you are willing to have online at any time to continue running and accepting new incoming messages. min.insync.replicas is 1 by default.
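A minimal sketch of setting this per topic (the topic name and values are assumptions, and it needs a cluster with at least 3 brokers): create the topic with replication factor 3 and min.insync.replicas=2, so that with acks=all a write must reach at least 2 in-sync replicas before it is acknowledged.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 5 partitions, replication factor 3, min.insync.replicas=2 as a topic-level override.
            NewTopic topic = new NewTopic("orders", 5, (short) 3)
                    .configs(Collections.singletonMap("min.insync.replicas", "2"));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}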

What is role of Zookeeper in kafka?

  1. Electing a controller. The controller is one of the brokers and is responsible for maintaining the leader/follower relationship for all the partitions. When a node shuts down, it is the controller that tells other replicas to become partition leaders to replace the partition leaders on the node that is going away. ZooKeeper is used to elect a controller, make sure there is only one, and elect a new one if it crashes.
  2. Cluster membership – which brokers are alive and part of the cluster? This is also managed through ZooKeeper.
  3. Topic configuration – which topics exist, how many partitions each has, where the replicas are, who the preferred leader is, and what configuration overrides are set for each topic.
  4. (0.9.0) Quotas – how much data each client is allowed to read and write.
  5. (0.9.0) ACLs – who is allowed to read and write to which topic. (Old high-level consumer) – which consumer groups exist, who their members are, and what the latest offset each group got from each partition is.

What is bootstrap.servers?
bootstrap.servers provides the initial hosts that act as the starting point for a Kafka client to discover the full set of alive servers in the cluster. bootstrap.servers is a configuration we place within clients, which is a comma-separated list of host and port pairs that are the addresses of the Kafka brokers in a “bootstrap” Kafka cluster that a Kafka client connects to initially to bootstrap itself.

Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list does not have to contain the full set of servers (you may want more than one, though, in case a server is down).

It is the URL of one of the Kafka brokers which you give to fetch the initial metadata about your Kafka cluster. The metadata consists of the topics, their partitions, the leader brokers for those partitions etc. Depending upon this metadata your producer or consumer produces or consumes the data.

You can have multiple bootstrap servers in your producer or consumer configuration, so that if one of the brokers is not accessible, the client falls back to another.

Note: Kafka’s default (sticky) partitioner keeps sending keyless messages to the same partition until the current batch (batch.size, 16 KB by default) fills up, and only then picks a new partition.

What is Consumer Group?
If more than one consumer comes together to read a topic, the topic, which is split across various partitions, is divided among the consumers in the group, with each partition read by one consumer in the group.

In Kafka, messages are always stored using key value format, with key being the one used for determining the partition after hashing and value the actual data.

During writing (message creation), producers use serializers to convert messages to byte format. Kafka provides different serializers based on the datatype that needs to be converted to bytes.
Consumers use deserializers at their end to convert the bytes back to the original data.

Kafka also allows custom serializers, which help in converting your own data types to a byte stream.
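A minimal sketch of a custom serializer (the Order class and its CSV encoding are made up purely for illustration, and it assumes a recent kafka-clients version where configure() and close() have default implementations). The producer would be configured with value.serializer=OrderSerializer; the consumer side would need a matching Deserializer<Order>.

import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.serialization.Serializer;

// Hypothetical domain object used only for this example.
class Order {
    final String id;
    final double amount;
    Order(String id, double amount) { this.id = id; this.amount = amount; }
}

public class OrderSerializer implements Serializer<Order> {
    @Override
    public byte[] serialize(String topic, Order order) {
        if (order == null) return null;
        // A real implementation would typically use JSON, Avro or Protobuf instead of CSV.
        return (order.id + "," + order.amount).getBytes(StandardCharsets.UTF_8);
    }
}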

How Consumer reads data
A consumer keeps track of the data it has read by using consumer offsets. A consumer offset in Kafka is a unique integer that tracks the position of the last message a consumer has processed in a partition.
In order to “checkpoint” how far it has read into a topic partition, the consumer regularly commits the latest processed message, also known as the consumer offset.

Offsets are important for a number of reasons:

  1. Data continuity: offsets allow consumers to resume processing from where they left off if the stream application fails or shuts down.
  2. Sequential processing: offsets enable Kafka to process data in a sequential and ordered manner.
  3. Replayability: offsets allow for replayable data processing.

When a consumer group is first initialized, consumers usually start reading from the earliest or latest offset in each partition. Consumers commit the offsets of messages they have processed successfully.
The position of the last available message in a partition is called the log-end offset. Consumers can store processed offsets in local variables or in-memory data structures, and then commit them in bulk.
Consumers can use a commit API to gain full control over offsets.
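Putting the above together, a minimal consumer sketch (topic, group id and host are assumptions): auto-commit is switched off and the processed offsets are committed explicitly with commitSync(), which is the commit API mentioned above.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OffsetCommitDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        // More than one bootstrap server can be listed so the client can fall back.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my-first-application");
        props.put("enable.auto.commit", "false");
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("firsttopic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
                // Checkpoint: commit the offsets of everything processed in this poll.
                consumer.commitSync();
            }
        }
    }
}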

  1. Ready the source code
  2. Upload the source code to Azure Repos
  3. Create a build pipeline in Azure – creation of the YAML pipeline
  4. Create a service connection for the project
  5. Build the release pipeline for deployment
  6. Compliance check for the build and release pipelines

There are two kinds of pipelines: build and release. There is mostly one build pipeline and multiple release pipelines. Multiple config files are appended to the release pipeline; these are basically YAML files that are displayed as stages. The artifact contains no details about the environment or DB config. The environment and config details are picked up from the stages, which hold multiple YAML files with the details of the various environments and configs, and these are appended to the artifact at deployment time.

Creating a New Build Pipeline for Project

  1. Create a new repository and add a readme.txt file, which creates the master branch. Add a simple Spring Boot project.
  2. Create a new pipeline. While creating the pipeline it asks you to select the repo. On successful creation of the pipeline, a new azure-pipeline.yml is created and added as a new file alongside the project files in the repo.
  3. Make the below changes in the azure-pipeline.yml file (applicable for a basic Spring Boot project):
    1. Step to create the revision number, mostly from environment variables
    2. Step to build the Spring Boot project
    3. Step to copy the JAR file and manifest.yml created at the end of the build
    4. Step to publish the artifact and put it in the drop location

Creating a New Release Pipeline for Project

  1. From the drop location the files are picked up by the release pipeline. This is configured in manifest.yml. The name of the JAR created should be the same as the one specified in the manifest, or else it fails with a file-not-found error.
  2. The release pipeline contains two things: artifacts and stages.
  3. The artifact is the one copied from the build pipeline. Azure DevOps is configured to pick the latest artifact from the branch.
  4. The trigger attached to the artifact tells from which branch the artifact should be copied and whether a new release should be created.
  5. Stages contain jobs and tasks. For running jobs we need an agent. This is again configurable; by default it is set to some Ubuntu Linux agent.
  6. The artifact available from the previous step now needs to be pushed to PCF, which is done by creating a new task. For this, a Cloud Foundry endpoint and commands are defined. In case you are using PCF you can use the Cloud Foundry CLI. In the arguments the location of manifest.yml should be specified. Reading this manifest helps locate the
    name of the JAR file that should be pushed into the cloud environment. For the same reason we copy both the JAR and the manifest in step 3(3) of the build pipeline; now they are picked from the drop location.
  7. There is a pre-deployment condition which checks for the availability of the artifact. This is again similar to the trigger, which keeps checking for the availability of a new release (artifact) for deployment.

Functions in JavaScript

  1. A function is represented as an object in JavaScript
  2. Has 2 phases – function definition and function execution
  3. Two ways of defining a function
    Function Declaration / Named Function – the function object gets created at the scope-creation phase
    Function Expression / Anonymous Function – the function object gets created at the execution phase – the interpreter throws an error in case the function is called before the anonymous definition.

      
      //Named Function
     displayAge(); 
     
     function displayAge(){
      console.log('My Age is 33')
     } 
     
     //Anonymous Function 
     var age = function(){ //Context/scope execution phase
       console.log('My Age is 33')
     } 
     age();
    
  4. There is no concept of function overloading. The function with the nearest matching arguments is called. In the code below getSum expects 2 arguments but it still gets called with none.
    function getSum(num1, num2) {
      console.log('Function overloading is not possible');
    }	
    getSum();
    
      Function overloading is not possible
    
  5. Only one entry is created in the scope (namespace) for a given function name.
  6. In the code below getLunch appears to be overloaded, but there is only one entry in the context with the name getLunch.
  7. So you may expect the output to differ, but every time it is getLunch(buffey, paid) – the last definition – that gets called in the code below.
    function getLunch() {
      console.log('Free Lunch');
    }
    
    function getLunch(paidLunch) {
      console.log('paidLunch');
    }
    
    function getLunch(buffey, paid) {
      console.log('paidLunch buffey');
    }
    getLunch();
    getLunch(5);
    getLunch(5,10);
    

    Output

    paidLunch buffey
    paidLunch buffey
    paidLunch buffey
    
  8. So what is the workaround? Check the code below, which branches on arguments.length.
     
      function getLunch() {
      if(arguments.length === 0)
        console.log('Free Lunch');
      
      if(arguments.length === 1)
        console.log('paidLunch');
        
      if(arguments.length === 2)
        console.log('paidLunch buffey');
      }
      
       getLunch();
       getLunch(5);
       getLunch(5,10);
    

    Output

    Free Lunch
    paidLunch
    paidLunch buffey
    
  9. Using the rest parameter feature from ECMAScript 6
     
    function getLunch(bill, space, ...menu) {
      console.log(bill);
      console.log(space);
      console.log(menu);
    }
    
    getLunch(150, 'Open Terrace', 'idly', 'dosa', 'vada');
    

    Output

    150
    Open Terrace
    ["idly", "dosa", "vada"]
    

Everything about pH – Acidic or Alkaline

  1. pH is the measure of acidity or alkalinity of soil. pH varies between 1 and 14, 1 being most acidic and 14 being most alkaline; 6.5 to 7 is considered neutral.
  2. Plants extract iron from the soil through their roots. If the soil is alkaline, iron stays bound to the soil. Depending on the soil pH, a mineral is either bound to soil particles or made soluble for uptake by the plant.
  3. pH is the concentration of hydrogen ions, which are present at very low levels: 0.0000001 molar is 10^-7, i.e. pH 7 (pH = -log10 of the hydrogen ion concentration). The more hydrogen ions that are loosely available, the lower the pH and the more acidic (not alkaline) the soil.

Low soil pH
Soil that is too acidic (having a low pH between 1.0 and 6.0) will show the following symptoms, caused by increased availability of aluminum and
decreased availability of phosphorus:

  1. wilting leaves
  2. stunted growth of plant and/or root
  3. yellow spots on the leaves that turn brown and lead to leaf death
  4. blighted leaf tips
  5. poor stem development

High soil pH
Soil that is too alkaline (having a high pH between 8.0 and 14.0) will show the following symptoms, caused by the plant's inability to absorb iron. Phosphorus is
also not readily available, and the micronutrients zinc, copper and manganese are also in limited supply.

  1. Interveinal chlorosis- (light green or yellowing of the leaf with green veining)
  2. General leaf discoloration

Certain plants thrive in slightly acidic or slightly alkaline conditions. If your asparagus, cauliflower, lettuce, parsley
and spinach are thriving, you may have more alkaline conditions, especially if plants like radishes, sweet potatoes, peppers and carrots, which prefer
more acidic conditions, are struggling; and vice versa.

Chlorosis is a yellowing of leaf tissue due to a lack of chlorophyll. Possible causes of chlorosis include poor drainage, damaged roots,
compacted roots, high alkalinity, and nutrient deficiencies in the plant. Nutrient deficiencies may occur because there is an insufficient amount in the soil or because the nutrients are unavailable due to a high pH (alkaline soil). Or the nutrients may not be absorbed due to injured roots or poor root growth.

Chlorosis can be caused by iron deficiency (interveinal chlorosis) or nitrogen deficiency (general chlorosis).

Iron deficiency or Interveinal Chlorosis
Interveinal chlorosis is a yellowing of the leaves between the veins, with the veins remaining green. A lack of iron in the soil can cause interveinal chlorosis, but so will a number of other soil issues, so a plant with interveinal chlorosis does not necessarily have an iron deficiency. Each of the following conditions can produce the same symptoms. Use iron sulfate around the plant: this adds iron, in case you do have a deficiency, and it also adds sulfur, which might help lower your soil pH. You can also try plain agricultural sulfur, which will lower the pH. When the pH goes down, plants have an easier time getting at the existing iron.

  1. a high soil pH or Soil is alkaline
  2. manganese deficiency
  3. compacted soil
  4. plant competition

Nitrogen deficiency or Chlorosis
Nitrogen taken up by plants is used in the formation of amino acids, which are the building blocks for proteins. Nitrogen is a structural component of chlorophyll. Urea, ammonium nitrate and calcium ammonium nitrate are common nitrogen-based fertilizers. When a plant suffers from nitrogen chlorosis, the older leaves turn yellow rather than the
younger leaves, since younger leaves have nitrogen readily available from the roots and more absorbing capacity than older leaves. Using Azospirillum helps in fixing nitrogen in the soil.

Cross-Origin Resource Sharing (CORS)
The browser’s same-origin policy blocks reading a resource from a different origin. This mechanism stops a malicious site from reading another site’s data. The same-origin policy tells the browser to block cross-origin requests. When you want to get a public resource from a different origin, the resource-providing server needs to tell the browser “This origin where the request is coming from can access my resource”. The browser remembers that and allows cross-origin resource sharing.

In an Angular front end, when the request origin is different from the server's origin, the browser stops processing the response from the server:

Request has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header
is present on the requested resource.

Same-Origin Policy

  1. The same-origin policy fights one of the most common cyber-attacks out there: cross-site request forgery.
  2. If you have logged into FB, your info is stored in a cookie and is tagged along every time a request is made.
  3. Every time you re-visit the FB tab and click around the app, you don’t have to sign in again. Instead, the API recognizes the stored session cookie on further HTTP requests.
    The only trouble is that the browser automatically includes any relevant cookies stored for a domain when another request is made to that exact domain.
  4. Say you clicked on a particularly tricky popup ad, opening evil-site.com. The evil site also has the ability to send a request to FB.com/api. Since the request is going to the FB.com domain, the browser includes the relevant cookies. Evil-site sends the session cookie and gains authenticated access to FB. Your account has been successfully hacked with a cross-site request forgery attack.
  5. At this point, the browser steps in and prevents the malicious code from making an API request like this. It stops evil-site and says “Blocked by the same-origin policy.”

How the browser works under the hood

  1. The browser checks that the request origin of the web application and the origin allowed in the server's response match.
  2. The origin is the combination of the protocol, host, and port.
    For example, in https://www.FB.com,
    the protocol is https://,
    the host is www.FB.com, and
    the hidden port number is 443 (the default port for https).

  3. To conduct the same-origin check, the browser accompanies all requests with a special request header
    that sends the domain information to the receiving server.
  4. For example, for an app running on localhost:3000, the special request header looks like this:
    Origin: http://localhost:3000
    

    Reacting to this special request, the server sends back a response header. This header contains an Access-Control-Allow-Origin key,
    to specify which origins can access the server’s resources. The key will have one of two values:

    One: the server can be really strict, and specify that only one origin can access it:
    Access-Control-Allow-Origin: http://localhost:3000

    Two: the server can let the gates go wide open, and specify the wildcard value to allow all domains to access its resources:
    Access-Control-Allow-Origin: *

  5. Once the browser receives this header information back, it compares the frontend domain with the Access-Control-Allow-Origin
    value from the server. If the frontend domain does not match the value, the browser raises the red flag and blocks the API
    request with the CORS policy error.

The above solution works for development. How about in production?

To address such issues, a proxy is used between client and server.

Request from Client -> Proxy Server -> Server
Response from Server -> Proxy Server (appends origin header) -> Client

What the proxy does is append the Access-Control-Allow-Origin header before the response is sent to the client browser.
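As a rough sketch of where that header gets added on the server or proxy side (this assumes a Servlet 4.0+ container; the filter name, URL pattern and allowed origin are illustrative, and frameworks like Spring can do the same via configuration):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebFilter;
import javax.servlet.http.HttpServletResponse;

@WebFilter("/*")
public class CorsFilter implements Filter {
    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletResponse res = (HttpServletResponse) response;
        // "*" opens the gates wide open; in production prefer the exact frontend origin.
        res.setHeader("Access-Control-Allow-Origin", "http://localhost:3000");
        chain.doFilter(request, response);
    }
}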

Symmetric Key Encryption (Private key Encryption)

  1. The same key is used between client and server to encrypt and decrypt messages
  2. A copy of the key exists at both ends
  3. The first time, the generated copy of the key should be sent securely to the other side
  4. Public key encryption (asymmetric encryption) is used to share the copy of the symmetric key for the first time
  5. The thought may arise: if I can share the key securely the first time, why not use the same methodology throughout? Because it is resource-intensive
  6. Advantage: encryption and decryption are faster compared to asymmetric key encryption
  7. Disadvantage: the key needs to be transferred the first time and must be stored securely

Asymmetric Key Encryption (Public key Encryption)

  1. Uses a public and a private key
  2. Data encrypted with one key is decrypted with the other. The client uses the public key to encrypt and the server uses the private key to decrypt.
  3. The public key is shared so that encrypted messages can be received from clients
  4. This is similar to a safe (public key) and its key (private key): when you send data it is encrypted using the public key, like a safe that
    does not need a key to lock, and only the server's private key can unlock it.

Man-In-Middle-Attack

  1. The man in the middle generates his own public key, which is what the client receives
  2. The client uses the public key provided by the man in the middle and sends his data
  3. The man in the middle decrypts the data using his own private key and then makes a seemingly genuine request to the server, encrypting it with the server's real public key
  4. Certificates were introduced to address this issue

Certificates

  1. The main purpose of the digital certificate is to ensure that the public key contained in the certificate belongs to the entity to which the
    certificate was issued, in other words, to verify that a person sending a message is who he or she claims to be, and to then provide the message
    receiver with the means to encode a reply back to the sender.
  2. This certificate can be cross-checked and confirmed with a certificate authority

Certificate Authority (CA)

  1. A CERTIFICATE AUTHORITY (CA) is a trusted entity that issues digital certificates, which are data files used to cryptographically link
    an entity with a public key. Certificate authorities are a critical part of the internet’s public key infrastructure (PKI) because
    they issue the Secure Sockets Layer (SSL) certificates that web browsers use to authenticate content sent from web servers.
  2. The role of the certificate authority is to bind the server's public key to a name that the browser can verify, to make sure the response comes from the genuine server.
    The Certificate Authority validates the identity of the certificate owner; the role of the CA is trust.
  3. Certificates must contain the public key, which can be cross-checked with the Certificate Authority (CA)
  4. CAs are mostly big companies like Symantec or Google, which act as a third party to provide trust.
  5. A self-signed certificate is one where you use your own server and client to generate the certificate; no CA comes into play.
    This approach may open the door to a man-in-the-middle attack.
  6. A root certificate is what you get when you use a self-signed certificate with your own custom CA. The root certificate must be available on
    all client systems that exchange data with the server.

Communication over HTTPS (HTTP over Secure Socket Layer)

  1. An SSL certificate is the web server's digital certificate issued by a third party. The third party verifies the identity of the web server and its public key.
  2. When you make a request to an HTTPS website, the site's server sends its public key inside a certificate digitally signed by the third party, i.e. the
    Certificate Authority (CA).
  3. On receiving the certificate, the browser checks it against the Certificate Authority to confirm that the certificate is valid.
  4. After verifying the certificate, the browser creates a symmetric session key – one copy kept by the browser and one for the server. The server's copy is
    encrypted using the web server's public key and sent to the server.
  5. The web server uses its private key to decrypt it. From then on the communication happens using the shared symmetric key.
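From plain Java, the whole handshake described above is handled by the JSSE library under the hood; a minimal sketch (the URL is just an example) that opens an HTTPS connection and prints the negotiated cipher suite and the server certificate:

import java.io.IOException;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

public class HttpsDemo {
    public static void main(String[] args) throws IOException {
        HttpsURLConnection conn = (HttpsURLConnection) new URL("https://example.com").openConnection();
        conn.connect();
        // Symmetric (session) encryption negotiated during the handshake.
        System.out.println("Cipher suite: " + conn.getCipherSuite());
        // Certificate presented by the server and verified against the trusted CAs.
        System.out.println("Server certificate: " + conn.getServerCertificates()[0]);
        conn.disconnect();
    }
}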

Typically, an applicant for a digital certificate will generate a key pair consisting of a private key and a public key, along with a certificate signing request (CSR) (Step 1). A CSR is an encoded text file that includes the public key and other information that will be included in the certificate (e.g. domain name, organization, email address, etc.). Key pair and CSR generation are usually done on the server or workstation where the certificate will be installed, and the type of information included in the CSR varies depending on the validation level and intended use of the certificate. Unlike the public key, the applicant’s private key is kept secure and should never be shown to the CA (or anyone else).

After generating the CSR, the applicant sends it to a CA (Step 2), who independently verifies that the information it contains is correct (Step 3) and, if so, digitally signs the certificate with an issuing private key and sends it to the applicant.

When the signed certificate is presented to a third party (such as when that person accesses the certificate holder’s website), the recipient can cryptographically confirm the CA’s digital signature via the CA’s public key. Additionally, the recipient can use the certificate to confirm that signed content was sent by someone in possession of the corresponding private key, and that the information has not been altered since it was signed.

KeyStore and TrustStore

  1. Technically a KeyStore and a TrustStore are the same kind of file. They just serve different purposes based on what they contain.
  2. A KeyStore is simply a database or repository or a collection of Certificates or Secret Keys or key pairs. When a KeyStore contains only certificates, you call it a TrustStore.
  3. When you also have Private Keys associated with their corresponding Certificate chain (Key Pair or asymmetric keys), it is called a KeyStore.
  4. Your default truststore is at $JAVA_HOME/jre/lib/security/cacerts
  5. ‘cacerts’ is a truststore. A truststore is used to authenticate peers. A keystore is used to authenticate yourself in mutual authentication.
  6. cacerts is where Java stores public certificates of root CAs. Java uses cacerts to authenticate the servers.
    Keystore is where Java stores the private keys of the clients so that it can share it to the server when the server requests client authentication.
  7. Keystore is used to store private key and identity certificates that a specific program should present to both parties (server or client) for verification.
    Truststore is used to store certificates from Certified Authorities (CA) that verify the certificate presented by the server in SSL connection.
  8. Mutual authentication requires Keystore and Truststore whereas Server-Client authentication requires truststore to store Certificates from CA.


List the content of your keystore file

keytool -v -list -keystore .keystore

To list a specific alias, you can also specify it in the command

keytool -list -keystore .keystore -alias foo

Importing Certificate to Truststore

keytool -import -trustcacerts -keystore $JAVA_HOME/jre/lib/security/cacerts -storepass changeit -alias Root -file Trustedcaroot.txt