Saturday 22 August 2015

CQ Theory

CQ5 - What’s this tool all about? 
CQ5 is a Java-based content management system from Adobe, previously known as Day CQ5.
1)    It is based on a content repository (i.e. it stores the content of a website in a content repository) and uses the JCR (Java Content Repository) specification to access that repository.
2)    It uses the RESTful Apache Sling framework to map a request URL to the corresponding node in the content repository.
3)    It uses the OSGi framework internally to allow modular application development: individual pieces of your application (called bundles in OSGi terms) can be independently started and stopped.
CQ5 uses Apache Felix as the OSGi container, so different parts of CQ5 itself can also be independently started and stopped.

Need of CMS
Some websites are very dynamic in nature: their content needs to be updated frequently, so it is easier to manage such websites with a CMS. The Adobe CQ5 platform allows you to build compelling content-centric applications that combine Web Content Management, Workflow Management, Digital Asset Management and Social Collaboration. The product has been completely redesigned from Communiqué 4, allowing Adobe to use new architecture and technologies, thus increasing functionality while reducing complexity. Extensive use of standards helps ensure long-term stability.

Technology Stack
CQ5 uses the following technologies:
1)    JCR – Java specification for accessing a content repository (JSR-283, JCR 2.0). CQ5 uses its own JCR implementation called CRX; Apache Jackrabbit is an open-source implementation of the JCR 2.0 specification.
2)    Apache Sling – RESTful framework for accessing a JCR over the HTTP protocol. It maps the request URL to a node in the JCR.
3)    OSGi – framework for modular application development in Java. Each module, called a bundle, can be independently started and stopped.


Content Repository (E.g.: JCR)
A content repository is basically a place where digital content is stored. Generally the structure of a content repository is hierarchical and represented as a tree, where each node of the tree is used to store content.
Java Content Repository (JCR) is a specification provided by the Java community for accessing a content repository in a uniform, platform-independent and vendor-independent way. The specification was initially released as JSR-170 (JCR 1.0) and later revised as JSR-283 (JCR 2.0).
The javax.jcr API provides the various classes and interfaces to access a content repository.
Apache Jackrabbit is an open-source implementation of the JCR 2.0 specification.
It provides some wrapper classes and interfaces specific to Jackrabbit, plus additional functionality on top of JCR.
The org.apache.jackrabbit packages are used to access Jackrabbit.

Sling (web-development framework)
Apache Sling is a RESTful framework for accessing a Java content repository over the HTTP protocol.
It is a content-driven framework: it maps the incoming user request, based on its URI, to the corresponding node in the content repository and, depending on the type of the request (GET, POST, etc.), executes the corresponding dynamic script.
For example, consider a scenario where a user visits the products page of a US website to get the details of product1.
The incoming URL request from the user will be
www.mywebsite.com/products/product1.html
This would be mapped by the Sling resource resolver to a node in the JCR
/content/mywebsite/us/products/product1
Sling then reads the sling:resourceType property of this node, which identifies the component whose scripts will render the content of this page.
For example, if the value of the sling:resourceType property is mywebsite/components/product, Sling looks for a rendering script under /apps/mywebsite/components/product (such as product.jsp or GET.jsp).
That script is then used to render the content for all incoming GET requests on the product1 node.
The main advantages of Sling are:
1) It maps the URL directly to the content.
2) It is RESTful.

OSGi and its benefit
OSGi is a framework which allows modular development of applications using Java.
A large application can be constructed from small reusable components (called bundles in OSGi terms), each of which can be independently started, stopped, and also configured dynamically while running, without requiring a restart.
Consider a scenario where you have a large application which uses a logging framework. This logging framework can be deployed as an OSGi bundle, which can be managed independently. Therefore, it can be started when required by our application and stopped when not in use. The OSGi container also lets bundles publish services, which other parts of the application can look up and use.
The main advantages of using OSGi:
1)    Reduces the complexity of the system.
2)    Makes the components loosely coupled and easy to manage.
3)    Can improve the performance of the system, since parts of the application which are not in use need not be loaded into memory (although the change in performance is not drastic, and some people argue that running an OSGi container itself takes a lot of memory).

Dispatcher Role 
Dispatcher is a CQ5 tool for caching and load-balancing. It has 2 responsibilities.
1)    Caching – To cache as much content as possible, so that the layout engine does not need to be accessed frequently to generate content dynamically.
2)    Load-balancing – To increase performance by distributing requests across several CQ instances.
Dispatcher uses 2 main strategies for caching.
1)    Cache as much content as possible as static pages.
2)    Access the layout engine as little as possible.
Note: The Dispatcher uses a cache directory for caching static content. The cached documents are created in the document root of the web server.
Dispatcher uses the following methods to keep the cache up to date:
1)    Content updates – Dispatcher removes those cached pages whose content has been updated, so that they are fetched again on the next request.
2)    Auto-invalidation – Dispatcher automatically flags those parts of the cache that may be out of date, without deleting them.
Coming to load-balancing: if multiple CQ instances are configured with a Dispatcher, it can balance the load between them, and if there is too much load on any CQ instance it can relay the request to another, less busy instance.
How Dispatcher performs load-balancing
1)    Performance Statistics – Dispatcher keeps statistics on how fast each CQ instance responds to a particular URL. Based on those metrics, Dispatcher determines which CQ instance is likely to give the quickest response for a request and relays the request to that instance.
2)    Sticky Connections – once a user session is established, all incoming requests from that user should be served by the same CQ instance, because other CQ instances cannot recognize the user session and generate personalized pages for that user. Dispatcher makes sure all requests for a user session are served by the same CQ instance.
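The two mechanisms above can be sketched in plain Java. This is only a toy model, not how the Dispatcher is actually implemented: the class name, the smoothing formula for the statistics and the session map are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of statistics-based routing with sticky connections.
// All names are hypothetical; the real Dispatcher is a web server module.
final class LoadBalancerSketch {
    // Smoothed response time in milliseconds per render (CQ instance).
    private final Map<String, Double> avgMillis = new LinkedHashMap<>();
    // Sticky connections: session id -> render that owns the session.
    private final Map<String, String> stickyRender = new HashMap<>();

    void addRender(String name) {
        avgMillis.put(name, 0.0);
    }

    // Fold a new measurement into the running average (assumed 80/20 smoothing).
    void recordResponse(String render, double millis) {
        avgMillis.merge(render, millis,
                (old, now) -> old == 0.0 ? now : 0.8 * old + 0.2 * now);
    }

    // Sticky session first; otherwise pick the render with the best statistics.
    String route(String sessionId) {
        if (sessionId != null && stickyRender.containsKey(sessionId)) {
            return stickyRender.get(sessionId);
        }
        String fastest = null;
        for (Map.Entry<String, Double> e : avgMillis.entrySet()) {
            if (fastest == null || e.getValue() < avgMillis.get(fastest)) {
                fastest = e.getKey();
            }
        }
        if (sessionId != null && fastest != null) {
            stickyRender.put(sessionId, fastest); // pin the session to this render
        }
        return fastest;
    }
}
```

Once a session has been routed, it stays on its render even if another render later becomes faster, which is the point of a sticky connection.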
Dispatcher caching
The Dispatcher uses the web server's ability to serve static content. The Dispatcher stores cached documents in the web server’s document root. The Dispatcher has two primary methods for updating the cache content when changes are made to the website.
• Content Updates remove the pages that have changed, as well as files that are directly associated with them.
• Auto-Invalidation automatically invalidates those parts of the cache that may be out of date after an update. It effectively flags the relevant pages as being out of date, without deleting anything.
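The difference between the two methods can be illustrated with a small in-memory model. It is a sketch under stated assumptions: the class and method names are invented, and a logical clock stands in for file timestamps; the real Dispatcher works on files in the web server's document root.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the two cache update methods (hypothetical names).
final class DispatcherCacheSketch {
    private final Map<String, Long> cachedAt = new HashMap<>(); // path -> logical time stored
    private long clock = 0;          // logical clock, advanced on every event
    private long invalidatedAt = -1; // time of the last auto-invalidation (-1 = never)

    void store(String path) {
        cachedAt.put(path, ++clock);
    }

    // Content update: the changed page is removed from the cache outright.
    void contentUpdate(String path) {
        cachedAt.remove(path);
        ++clock;
    }

    // Auto-invalidation: nothing is deleted; older entries merely become stale.
    void autoInvalidate() {
        invalidatedAt = ++clock;
    }

    boolean isCached(String path) {
        return cachedAt.containsKey(path);
    }

    // A cached page may be served only if it was stored after the last invalidation.
    boolean isFresh(String path) {
        Long storedAt = cachedAt.get(path);
        return storedAt != null && storedAt > invalidatedAt;
    }
}
```

A stale page is still present in the cache directory; it is simply fetched again from CQ the next time it is requested.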

How Content moved from author to publish instance
Once content is published, it is moved from the author instance to the publish instance by a replication agent.
A replication agent has the following responsibilities:
1.       Move content from the author to the publish instance.
2.       If any content is updated, flush the older content from the Dispatcher cache.
An environment can contain multiple CQ author and multiple CQ publish instances, so an author instance can be configured with many replication agents, each of which replicates the content to one or more publish instances.
When a user requests that content be published, the replication agent packages the content and places it in a replication queue.
A listener servlet on the publish instance receives the content package and updates the content on the publish instance. The default listener servlet on the publish instance is "http://localhost:4503/bin/receive".

What is Replication in CQ5?
Replication is used to:
1)    Publish (activate) content from the author to the publish environment.
2)    Explicitly flush content from the Dispatcher cache.
3)    Return user input from the publish environment to the author environment.
REPLICATION PROCESS:
1)    First, the author requests that certain content be published (activated).
2)    The request is passed to the appropriate default replication agent.
3)    The replication agent packages the content and places it in the replication queue.
4)    The content is lifted from the queue and transported to the publish environment using the configured protocol.
5)    A servlet in the publish environment receives the request and publishes the received content; the default servlet is http://localhost:4503/bin/receive.
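The replication process above can be sketched as a queue between two stores. This is a simplified model for illustration only: the class, the maps standing in for the author and publish repositories, and the method names are assumptions, and no HTTP transport is involved.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model of author -> publish replication (hypothetical names).
final class ReplicationSketch {
    // A packaged piece of content waiting to be transported.
    static final class ContentPackage {
        final String path;
        final String content;

        ContentPackage(String path, String content) {
            this.path = path;
            this.content = content;
        }
    }

    final Map<String, String> author = new HashMap<>();  // stands in for the author repository
    final Map<String, String> publish = new HashMap<>(); // stands in for the publish repository
    private final Queue<ContentPackage> queue = new ArrayDeque<>();

    // Steps 1-3: activation packages the content and places it in the queue.
    void activate(String path) {
        String content = author.get(path);
        if (content == null) {
            throw new IllegalArgumentException("no such content: " + path);
        }
        queue.add(new ContentPackage(path, content));
    }

    // Steps 4-5: each package is taken from the queue and stored on publish.
    int processQueue() {
        int delivered = 0;
        ContentPackage next;
        while ((next = queue.poll()) != null) {
            publish.put(next.path, next.content);
            delivered++;
        }
        return delivered;
    }
}
```

Until the queue is processed, activated content exists only on the author side, which mirrors what happens when a replication queue is blocked.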



How is content moved from publish instance to author instance?
Consider a scenario where your website has a blog or a forum and users post comments on it. Those comments exist only on the publish instance. The content is moved from the publish instance to the author instance using reverse replication, and the job is done by a reverse replication agent.
The reverse replication agent places any content updates in an outbox configured on the publish instance. The replication listeners in the author environment keep polling the publish outbox, and whenever content is placed there, they update the content on the author instance.

Persistence manager in CQ
CQ uses a persistence manager to save the content to persistent storage such as a file system or a database.
By default the CRX content is stored using the Tar persistence manager, which stores the content on the file system in standard Unix tar archive files.
If you want to store the repository content in a database, you can configure CQ5 to use a database persistence manager.

REST and RESTful Framework
REST stands for Representational State Transfer. REST-style architectures consist of clients and servers. Clients initiate requests to servers; servers process requests and return appropriate responses. Requests and responses are built around the transfer of representations of resources. A resource can be essentially any coherent and meaningful concept that may be addressed. A representation of a resource is typically a document that captures the current or intended state of a resource. Apache Sling is a RESTful framework for accessing a Java content repository over the HTTP protocol.
According to the REST architecture, there are some constraints which should be followed:
Client-Server
There should be a clear separation between a client and server.
Stateless
The client-server communication is further constrained by no client context being stored on the server between requests. Each request from any client contains all of the information necessary to service the request, and any session state is held in the client. The server can be stateful.
Cacheable
As on the World Wide Web, clients can cache responses. Responses must therefore, implicitly or explicitly, define themselves as cacheable, or not, to prevent clients reusing stale or inappropriate data in response to further requests. Well-managed caching partially or completely eliminates some client–server interactions, further improving scalability and performance.
One more thing that differentiates a RESTful framework from other web-development frameworks is that most web-development frameworks rely almost exclusively on HTTP GET and POST requests, whereas a RESTful framework makes full use of the HTTP protocol and its request methods (HEAD, GET, POST, PUT, DELETE).

Advantages of CQ5 over other CMS:
  • Implementation of workflows for creating, editing and publishing of content
  • Managing a repository of digital assets like images, documents and integrating them to the websites.
  • Usage of search queries to find content no matter where it is stored in your organization.
  • Easy setup of social collaboration features such as blogs and groups.
  • Tagging utility to organize the digital assets such as images.
Resource resolution process in Sling
The following breakdown shows how a URL is resolved and mapped to a resource.

Consider the URL
http://myhost/products/product1.printable.a4.html/a/b?x=12
Here the type of request will be an HTTP GET request.
We can break it down into its composite parts:
protocol        http://
host            myhost
content path    products/product1
selector(s)     .printable.a4.
extension       html
suffix          a/b (after the "/")
param(s)        x=12 (after the "?")
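The decomposition above can be reproduced with a few lines of Java. This is a simplified sketch, not Sling's own parser: it assumes the content path contains no dots (Sling itself finds the longest path that matches a resource in the repository), and the class and field names are invented for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified URL decomposition (hypothetical class; assumes no dots in the content path).
final class SlingUrlSketch {
    final String contentPath;
    final List<String> selectors;
    final String extension; // null if the URL has none
    final String suffix;    // null if the URL has none; kept with its leading slash
    final String query;     // null if the URL has none

    private SlingUrlSketch(String contentPath, List<String> selectors,
                           String extension, String suffix, String query) {
        this.contentPath = contentPath;
        this.selectors = selectors;
        this.extension = extension;
        this.suffix = suffix;
        this.query = query;
    }

    static SlingUrlSketch decompose(String url) {
        URI uri = URI.create(url);
        String path = uri.getPath();
        int dot = path.indexOf('.');
        if (dot < 0) {
            // No selectors or extension: the whole path is the content path.
            return new SlingUrlSketch(path, new ArrayList<String>(), null, null, uri.getQuery());
        }
        String contentPath = path.substring(0, dot);
        String rest = path.substring(dot + 1);   // e.g. "printable.a4.html/a/b"
        int slash = rest.indexOf('/');
        String suffix = slash < 0 ? null : rest.substring(slash);
        String dotted = slash < 0 ? rest : rest.substring(0, slash);
        // Everything before the last dot-separated part is a selector.
        List<String> parts = new ArrayList<>(Arrays.asList(dotted.split("\\.")));
        String extension = parts.remove(parts.size() - 1);
        return new SlingUrlSketch(contentPath, parts, extension, suffix, uri.getQuery());
    }
}
```

For the example URL this yields the content path /products/product1, the selectors printable and a4, the extension html, the suffix /a/b and the query string x=12.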

From URL to Content and Scripts
Using these principles:
• the mapping uses the content path extracted from the request to locate the resource
• when the appropriate resource is located, the sling resource type is extracted, and used to locate the script to be used for rendering the content


Mapping requests to resources
The request is broken down and the necessary information extracted. The repository is searched for the requested resource (content node):
• first Sling checks whether a node exists at the location specified in the request; e.g. ../content/corporate/jobs/developer.html
• if no node is found, the extension is dropped and the search repeated; e.g. ../content/corporate/jobs/developer
• if no node is found then Sling will return the HTTP code 404 (Not Found).
Note: Sling also allows things other than JCR nodes to be resources, but this is an advanced feature.
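The lookup order described above can be sketched as follows. The repository is modelled as a plain set of node paths, which is an assumption made to keep the example self-contained; the class and method names are hypothetical and this is not Sling's resource resolver.

```java
import java.util.Optional;
import java.util.Set;

// Toy version of the lookup order: exact path first, then the path without extension.
final class ResourceLookupSketch {
    static Optional<String> resolve(Set<String> repositoryPaths, String requestPath) {
        // 1. Does a node exist at exactly the requested location?
        if (repositoryPaths.contains(requestPath)) {
            return Optional.of(requestPath);
        }
        // 2. Drop everything from the first dot in the last path segment and retry.
        int lastSlash = requestPath.lastIndexOf('/');
        int dot = requestPath.indexOf('.', lastSlash + 1);
        if (dot >= 0) {
            String withoutExtension = requestPath.substring(0, dot);
            if (repositoryPaths.contains(withoutExtension)) {
                return Optional.of(withoutExtension);
            }
        }
        // 3. Nothing found.
        return Optional.empty();
    }

    // Map the lookup result to the HTTP status mentioned in the text.
    static int statusFor(Set<String> repositoryPaths, String requestPath) {
        return resolve(repositoryPaths, requestPath).isPresent() ? 200 : 404;
    }
}
```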
Locating the script
When the appropriate resource (content node) is located, the sling resource type is extracted. This is a path, which locates the script to be used for rendering the content.
The path specified by the sling:resourceType can be either:
• absolute
• relative, to a configuration parameter
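How an absolute or relative sling:resourceType turns into candidate locations can be sketched like this. The search path of /apps followed by /libs is assumed here as the usual default of that configuration parameter, and the resource type mywebsite/components/product is a hypothetical example; the class is illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Resolves a sling:resourceType value to the locations that would be searched.
final class ResourceTypeSketch {
    // Assumed default search path; in Sling this is a configurable setting.
    static final List<String> SEARCH_PATH = Arrays.asList("/apps/", "/libs/");

    static List<String> candidateLocations(String resourceType) {
        List<String> result = new ArrayList<>();
        if (resourceType.startsWith("/")) {
            // Absolute: used exactly as given.
            result.add(resourceType);
        } else {
            // Relative: tried under each search path entry, in order.
            for (String root : SEARCH_PATH) {
                result.add(root + resourceType);
            }
        }
        return result;
    }
}
```

With these assumptions a relative type is looked up under /apps before /libs, so a project can overlay a script that ships with the product.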

What is Reverse Replication?
Reverse replication is used to get user content generated on a publish instance back to the author instance. To do this you need a reverse replication agent in the author environment. This acts as the active component that collects information from the outbox in the publish environment.
Dialog and Design Dialog:
A dialog is a key element of your component, as it provides an interface for authors to configure and provide input to that component. The user input is stored at page level.
A design dialog stores its content at the template level. Its content can be changed in design mode.
A design dialog stores values globally, in the design of the template, whereas a normal dialog stores values inside the page’s own content. One of the major benefits of using the design dialog is that if you have a hundred pages sharing the same template, the values will be shared amongst them. Also, note that a component can have both a design dialog and a normal dialog.
Within CQ/AEM, there are some key folder structures.  For example, you may have a page at the following path:
/content/your-site/en_HK/your-page
Within that page you have a component that has both a dialog and a design dialog.  When you save content in the dialog, it is saved to a node beneath the given page.  Something like:
/content/your-site/en_HK/your-page/jcr:content/some-par-sys/your-component
However, when you save content in the design dialog, it is saved to a 'global' design path.  Something like:
/etc/designs/your-site/jcr:content/your-template/your-component
This mechanism makes content saved in the design dialog for a given template available on all pages created from that template.

This comes in handy for headers and footers and, as the name implies, in cases where a consistent design should be used.

Difference between Parsys and Iparsys?
Parsys: Parsys (paragraph system) is a placeholder into which we can drag and drop components; the script (or content) of each component is rendered at that place.
iParsys: Iparsys, or Inherited Paragraph System, is similar to parsys except that it allows you to inherit the created paragraphs from the parent page.
Cancel Inheritance: set on the child page, so that it stops inheriting paragraphs from its parent.
Disable Inheritance: set on the parent page, so that its paragraphs are not inherited by child pages.

Explain OSGi [Open Services Gateway initiative] in CQ5?
• Dynamic module system for Java.
• Falls under the category of universal middleware.
• Helps applications to be constructed from small, reusable and collaborative components.
• OSGi bundles can contain compiled Java code, scripts, or any content to be loaded into the repository.
• Lets bundles be installed and loaded at runtime.
How are bundles loaded and installed in CQ5?
This is managed through the Sling Management Console of CQ5 (the Apache Felix Web Console).
How is clustering done in CQ5?
CRX in CQ5 is pre-configured to run within a cluster, even when running a single instance. Hence multi-node clusters can be configured with little effort.
What is the contribution of the Servlet Engine in CQ5?
The Servlet Engine acts as the server within which each CQ (and CRX, if used) instance runs. Even though you can run CQ WCM without an application server, a Servlet Engine is always needed.
Explain the role of Dispatcher in CQ5?
In CQ5, the Dispatcher handles caching and load-balancing. Its main responsibilities are:
i)    Caching – Cache as much content as possible, so that the layout engine does not have to generate the same content dynamically over and over.
ii)   Load-balancing – To increase performance by distributing requests across CQ instances.
State various strategies used by Dispatcher?
i)    Cache as much content as possible as static pages.
ii)   Access the layout engine as little as possible.
Where does the cache directory exist for CQ5?
The cached documents are created in the document root of a preconfigured web server.
Explain the methods of caching adopted by Dispatcher?
i)    Content updates – Dispatcher removes those cached pages whose content has been updated, so that they are fetched again.
ii)   Auto-invalidation – Dispatcher automatically flags cached content that may be out of date, without deleting it.
How can you inherit properties of one dialog in another dialog?
To inherit dialog fields, define them with unique names in the dialog of the base component. For example, if you plan to have two rich text areas in the dialogs of the components that inherit from the base, you must include two rich text areas with unique names in the base component dialog.
Any issue if the name is not unique?
Each input field of a dialog must have a unique name; otherwise both fields will point to the same property path, relative to the jcr:content node of the component, when used on a page.
Can we prevent certain users from seeing some digital assets?
You can limit who can access certain folders in CQ Digital Assets by making the folder part of a CUG (closed user group).
Steps to make a folder part of a CUG:
In CQ DAM, right-click the folder you want to add closed user group properties for and select Properties.
Click the CUG tab.
Select the Enabled check box to make the folder and its assets available only to a closed user group.
Browse to the login page, if there is one, to add that information. Add admitted groups by clicking Add item. If necessary, add the realm. Click OK to save your changes.

Difference between OSGi bundle and Normal Jar file?
1)      OSGi bundles are jar files with metadata inside. Much of this metadata is in the jar’s manifest, found at META-INF/MANIFEST.MF. This metadata, when read by an OSGi runtime container, is what gives the bundle its power.
2)      With OSGi, just because a class is public doesn’t mean you can get to it. All bundles include an export list of package names, and if a package isn’t in the export list, it doesn’t exist to the outside world. This allows developers to build an extensive internal class hierarchy and minimize the surface area of the bundle’s API without abusing the notion of package-private visibility. A common pattern, for instance, is to put interfaces in one package and implementations in another, and only export the interface package.
3)      All OSGi bundles are given a version number, so it’s possible for an application to simultaneously access different versions of the same bundle (e.g. junit 3.8.1 and junit 4.0). Since each bundle has its own class loader, the classes of both bundles can coexist in the same JVM.
4)      OSGi bundles declare which other bundles they depend upon. This allows them to ensure that any dependencies are met before the bundle is resolved. Only resolved bundles can be activated. Because bundles have versions, versioning can be included in the dependency specification, so one bundle can depend on junit version 3.8.1 and another bundle on junit version 4.0.
5)      An OSGi bundle can contain a bundle activator (conventionally a class named Activator), an optional listener class that is notified of bundle start and stop events.
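To make the manifest metadata concrete, the sketch below parses a bundle manifest with the JDK's java.util.jar.Manifest class and checks which packages it exports. The manifest content (the com.example.logging bundle, its packages and versions) is entirely hypothetical, and the comma-based parsing of Export-Package is deliberately naive; an OSGi container does far more than this.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Reads OSGi headers from a manifest using only JDK classes.
final class BundleManifestSketch {
    // Hypothetical manifest of a bundle that exports only its API package.
    static final String EXAMPLE =
            "Manifest-Version: 1.0\n"
          + "Bundle-ManifestVersion: 2\n"
          + "Bundle-SymbolicName: com.example.logging\n"
          + "Bundle-Version: 1.2.0\n"
          + "Bundle-Activator: com.example.logging.impl.Activator\n"
          + "Export-Package: com.example.logging.api;version=\"1.2.0\"\n"
          + "Import-Package: org.osgi.framework\n";

    static Attributes mainAttributes(String manifestText) {
        try {
            byte[] bytes = manifestText.getBytes(StandardCharsets.UTF_8);
            return new Manifest(new ByteArrayInputStream(bytes)).getMainAttributes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Naive check: splits on commas, so it ignores commas inside quoted attributes.
    static boolean exportsPackage(Attributes attrs, String packageName) {
        String exported = attrs.getValue("Export-Package");
        if (exported == null) {
            return false;
        }
        for (String clause : exported.split(",")) {
            String name = clause.split(";")[0].trim(); // drop ;version=... and similar
            if (name.equals(packageName)) {
                return true;
            }
        }
        return false;
    }
}
```

In this example the impl package holds the activator but is not listed in Export-Package, which is the interface/implementation split described in point 2.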

What is the difference between
1. <c:import url="layout-link.jsp" />
2. <sling:include path="layout-link.jsp" />
3. <cq:include script="layout-link.jsp" />
What is the advantage of each tag? When should each be used?
CQ include is most appropriate when you are doing standard component/template development.
Sling include is most appropriate when you are trying to include a piece of content based strictly on Sling resource resolution and not on CQ component type logic.
1. <c:import url="layout-link.jsp" />
This is the import tag of the JSP Standard Tag Library (JSTL). This tag is documented at http://java.sun.com/products/jsp/jstl/1.1/docs/tlddocs/c/import.html and does not know about Sling directly.
However, assuming this tag uses a RequestDispatcher to dispatch the request, the request will also pass through Sling and the Sling resource resolver.

2. <sling:include path="layout-link.jsp" />
This is the include tag of the Sling JSP tag library. This tag knows about Sling and also supports RequestDispatcherOptions.

3. <cq:include script="layout-link.jsp" />
This tag is a Communiqué-specific extension of the Sling JSP tag library include tag. IIRC it supports calling scripts in addition to just including renderings of resources.
What is the advantage of each tag? When should each be used?
In a Communiqué application, I would suggest generally using the Communiqué or Sling include tag, since this provides you with more Sling support.
You may use the JSTL import tag if you don’t have specific requirements for Sling extended features, plan to use the JSP (fragment) outside of Communiqué, or want to further process the generated (imported) content with a reader or a variable.
In the future, it is conceivable that the Sling and/or Communiqué tag library will also provide an import tag similar to the JSTL import tag to be able to further process the imported result.
Source: http://dev.day.com/discussion-groups/content/lists/cq-google/2009-10/2009-10-06__day_communique_tag_difference_cq_sling_c__zambak.html

Clientlibs
The “clientlib” functionality manages all the JavaScript and CSS resources in your application. It takes care of dependency management, merging files and minifying content.
Clientlib: CQ Static Resource Management
For every web application, performance is an important factor that we often ignore at first, and later it becomes a bottleneck. Performance can be improved by considering various factors while designing a new application; a few of them are listed below:
1) Web page size
A web page is composed of HTML markup, JS files, CSS files and images. We should keep the page size as low as possible so that the page loads quickly in the browser.
2) Ajax calls v/s full page reload
There are many instances where it is better to make an Ajax call to the server and update a small area (HTML DOM) of the page rather than reloading the whole page.
3) Amount of data transfer between server and browser
When we call a service on the server, the service should return only page/context-specific data rather than the whole information/data. We can call the server again (via Ajax calls) to fetch limited data and update the page accordingly.



CQ Load balancing:
Load balancing distributes user requests (load) across different clustered CQ instances. The following list describes the advantages of load balancing:
  • In practice this means that the Dispatcher shares document requests between several instances of CQ. Because each instance has fewer documents to process, you have faster response times. The Dispatcher keeps internal statistics for each document category, so it can estimate the load and distribute the queries efficiently.
  • If the Dispatcher does not receive responses from an instance, it will automatically relay requests to one of the other instance(s). Thus, if an instance becomes unavailable, the only effect is a slowdown of the site, proportionate to the computational power lost.

What is Personalization?
Personalization provides your users with a customized environment that displays dynamic content selected according to their specific needs.
There is an ever-increasing volume of content available today, be it on internet, extranet, or intranet websites.
Personalization centers on providing the user with a tailor-made environment displaying dynamic content that is selected according to their specific needs, be this on the basis of predefined profiles, user selection, or interactive user behavior.
The Teaser component is also used in personalization and segmentation.

Multi-Site Management
Multi-Site Management handles multilingual and multinational content, helping your company balance centralized branding with localized content.
Parbase
Parbase is a key component as it allows components to inherit attributes from other components, similar to subclasses in object-oriented languages such as Java, C++, and so on. For example, when you open the /libs/foundation/components/text node in CRXDE Lite, you see that it has a property named sling:resourceSuperType, which references the parbase component. The parbase defines scripts to render images, titles, and so on, so that all components subclassed from this parbase can use these scripts.
For the image component, features such as crop and map are inherited in the same way.
Users do not need access to the parbase.
Parsys
The paragraph system (parsys) is a compound component that allows authors to add components of different types to a page and contains all other paragraph components. Each paragraph type is represented as a component. The paragraph system itself is also a component, which contains the other paragraph components.

CQ DAM
CQ DAM (Communiqué Digital Asset Management) is used to centrally manage all digital media files and essential metadata information.

Widgets
CQ WCM has been developed using the ExtJS library of widgets.

FileVault (source revision system)
FileVault provides your JCR repository with file system mapping and version control. It can be used to manage CQ development projects with full support for storing and versioning project code, content, configurations and so on, in standard version control systems (for example, Subversion).

Workflow Engine
Your content is often subject to organizational processes, including steps such as approval and sign-off by various participants. These processes can be represented as workflows, defined within CQ, then applied to the appropriate content pages or digital assets as required.
The Workflow Engine is used to manage the implementation of your workflows, and their subsequent application to your content.

Dispatcher
The Dispatcher is the Adobe caching and/or load balancing tool that helps realize a fast and dynamic Web authoring environment. For caching, the Dispatcher works as part of an HTTP server, such as Apache, with the aim of storing (or "caching") as much of the static website content as possible and accessing the website's layout engine as infrequently as possible. In a load balancing role, the Dispatcher distributes user requests (load) across different clustered CQ instances (renders).
For caching, the Dispatcher module uses the Web server's ability to serve static content. The Dispatcher places the cached documents in the document root of the Web server.

Localization
Localization is at the core of CQ5. It provides support for adapting applications created using the CQ5 platform to different languages and regional configurations. While processing the request, the Locale is extracted. This is then used to reference a language code, and optionally a country code, which can be used for controlling either the specific content or the format of certain output.
Localization is used throughout CQ5, wherever reasonable. One notable exception is the system log information of CQ5 itself, which is never localized and is always in English.

Template
A Template is used to create a Page and defines which components can be used within the selected scope. A template is a hierarchy of nodes that has the same structure as the page to be created, but without any actual content.
Each Template will present you with a selection of components available for use.
  • Templates are built up of Components.
  • Components use, and allow access to, Widgets, and these are used to render the Content.
  • If you want your template to be displayed in the Create Page dialog when creating a page right under Websites from the Websites console, set the allowedPaths property of the template node to: /content(/.*)?
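The allowedPaths value is a regular expression, and its effect can be tried out with java.util.regex. The sketch below assumes that the expression has to match the whole path under which the page is created; that matching rule and the helper class are assumptions for illustration, not CQ's actual template filtering code.

```java
import java.util.regex.Pattern;

// Tests a path against an allowedPaths expression (hypothetical helper).
final class AllowedPathsSketch {
    static boolean isAllowed(String allowedPathsRegex, String path) {
        // Pattern.matches requires the entire path to match the expression.
        return Pattern.matches(allowedPathsRegex, path);
    }
}
```

With /content(/.*)? the path /content itself and anything below it, such as /content/mysite/en, are accepted, while /contentx or /etc/designs are not.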
Templates are used to create Pages of type cq:Page (a Page is a special type of Component). Each CQ Page has a structured node jcr:content. This:
  • is of type cq:PageContent
  • is a structured node-type holding a defined content-definition
  • has a property sling:resourceType to reference the component holding the Sling scripts used for rendering the content

Benefit of OSGi Bundle over conventional Java “jar” file

While it is true that you can start and stop a “jar” file, the concept of a bundle expands upon the conventional “jar” file by including metadata such as the version and list of services imported and exported by the bundle. This allows an OSGi bundle to be installed, updated, and uninstalled without taking down the entire application. Also, OSGi bundling allows multiple versions to exist, with the OSGi framework assuming the responsibility of matching “service consumers” with “service providers”. The net result is that an OSGi bundle is more of a standalone “software module” than a “jar”, “war”, or “ear” file.

How does CQ interface with LDAP systems?
CQ interfaces with LDAP systems, such as Apache Directory or Windows Active Directory, using the Java Authentication and Authorization Service (JAAS).

What LDAP information is synchronized with CQ?
Information about users and groups is synchronized between CQ and the LDAP system.
Although LDAP accounts might be assigned to groups, these associations often reflect organizational properties, because the accounts may be applied to multiple applications. It is recommended that you apply CRX groups to the accounts to control permissions within CRX or any dependent applications, such as CQ.
Additional reading:
  • See this Help page for more information about administering users, groups, and access rights/permissions in CRX.
  • For CRX user and group administration best practices, see this section.


How to create a custom widget client-library?
CQ5 provides an interface to add new JavaScript functionality to the WCM authoring interface through the ExtJS framework. These so-called client-libraries let you:
  • add custom functionality, or
  • override and extend existing features
Create custom client-library
The next steps outline what is required to build a custom client-library.
  • Go to CRXDE Lite (e.g. http://<host>:<port>/crxde)
  • Create a node with type cq:ClientLibraryFolder, e.g. /apps/myapp/ui/mylib
  • Set the String property sling:resourceType to widgets/clientlib
  • Set the (multi-value) property categories to one or more of the following:

Category Name    Description
cq.wcm.admin     SiteAdmin and Tools
cq.wcm.edit      ContentFinder and edit page
cq.dam.admin     DAM Admin
cq.dam.edit      DAM AssetShare, AssetEditor

  • If required, set the (multi-value) String property dependencies to other client-libraries in the system
